Basic article information

Title: Comparison of two methods for solving Markov Decision Processes in the persecution-evasion problem
Authors: Michel García; Cinhtia González; Enrique Succar et al.
Journal: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Year of publication: 2010
Volume: 10
Issue: 4
Pages: 295-299
Publisher: International Journal of Computer Science and Network Security
Abstract: There are two basic approaches to solving Markov decision processes (MDPs). One is to build a model of the process and obtain the optimal policy using value or policy iteration. The other consists of obtaining the policy by trial and error using reinforcement learning. Although both have been used to solve different decision problems, their merits and limitations have not been compared experimentally in the same domain. We have used both approaches to solve a pursuit-evasion problem in mobile robotics. We represent this problem as a relational MDP, considering the distance and position of the evader relative to the pursuer, and obtain the optimal policy for the pursuer by (i) building a model and solving it with value iteration, and (ii) using reinforcement learning. We have implemented both approaches in a simulated environment and compared them in terms of effectiveness, efficiency and ease of model construction.
Keywords: MDP; Reinforcement Learning (RL); Value Iteration