Article Information

  • Title: Trajectory-Based Modified Policy Iteration
  • Authors: R. Sharma, M. Gopal
  • Journal: International Journal of Computer Systems Science and Engineering
  • Print ISSN: 1307-430X
  • Year of publication: 2007
  • Volume: 03
  • Issue: 01
  • Pages: 27-27
  • Publisher: World Academy of Science, Engineering and Technology
  • Abstract: This paper presents a new problem-solving approach that generates an optimal policy for finite-state stochastic sequential decision-making problems with high data efficiency. The proposed algorithm iteratively builds and improves an approximate Markov Decision Process (MDP) model, along with approximate cost-to-go values, by generating finite-length trajectories through the state space. The approach creates a synergy between the evolving approximate model and the approximate cost-to-go values to produce a sequence of improving policies that converges to the optimal policy through an intelligent and structured search of the policy space. The approach modifies the policy update step of policy iteration to achieve fast and stable convergence to the optimal policy. We apply the algorithm to a non-holonomic mobile robot control problem and compare its performance with other Reinforcement Learning (RL) approaches, e.g., (a) Q-learning, (b) Watkins' Q(λ), and (c) SARSA(λ).
  • Keywords: Markov Decision Process (MDP), Mobile robot, Policy iteration, Simulation.
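
For orientation, the sketch below shows the classic modified policy iteration scheme that the title refers to. It is not the authors' trajectory-based algorithm. It assumes the MDP model is fully known, whereas the abstract describes learning an approximate model from trajectories. The function name, the array layouts `P[a, s, s']` and `c[s, a]`, the discount factor `gamma`, and the number of partial evaluation sweeps `m` are all illustrative assumptions.

```python
import numpy as np


def modified_policy_iteration(P, c, gamma=0.95, m=5, tol=1e-8, max_iter=1000):
    """Classic modified policy iteration on a known finite MDP (illustrative).

    P: array of shape (A, S, S), P[a, s, t] = Pr(next state t | state s, action a)
    c: array of shape (S, A), stage cost of taking action a in state s
    Returns (policy, J), where J holds the cost-to-go estimates.
    """
    n_actions, n_states, _ = P.shape
    states = np.arange(n_states)
    J = np.zeros(n_states)
    policy = np.zeros(n_states, dtype=int)

    for _ in range(max_iter):
        # Policy improvement: act greedily (minimum cost) with respect to J.
        Q = c + gamma * np.einsum("ast,t->sa", P, J)
        policy = Q.argmin(axis=1)
        J_new = Q.min(axis=1)

        # Partial policy evaluation: m extra backups under the fixed policy,
        # instead of solving the evaluation equations exactly.
        P_pi = P[policy, states]   # shape (S, S), row s uses action policy[s]
        c_pi = c[states, policy]   # shape (S,)
        for _ in range(m):
            J_new = c_pi + gamma * P_pi @ J_new

        converged = np.max(np.abs(J_new - J)) < tol
        J = J_new
        if converged:
            break

    return policy, J
```

With `m = 0` the loop reduces to value iteration, and as `m` grows it approaches standard policy iteration. Intermediate values of `m` trade evaluation accuracy against cost per iteration.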