Article Information

  • Title: Cross Entropy Optimization of Action Modification Policies for Continuous-Valued MDPs
  • Authors: Kamelia Mirkamali; Lucian Buşoniu
  • Journal: IFAC PapersOnLine
  • Print ISSN: 2405-8963
  • Year: 2020
  • Volume: 53
  • Issue: 2
  • Pages: 8124-8129
  • DOI: 10.1016/j.ifacol.2020.12.2292
  • Language: English
  • Publisher: Elsevier
  • Abstract: We propose an algorithm to search for parametrized policies in Markov Decision Processes (MDPs) with continuous states and actions. The policies are represented via a number of basis functions, and the main novelty is that each basis function corresponds to a small, discrete modification of the continuous action. In each state, the policy chooses the discrete action modification associated with the basis function that has the maximum value at that state. To evaluate the policies, empirical returns from a representative set of initial states are estimated in simulations. Instead of using slow gradient-based algorithms, we apply the cross-entropy method to update the parameters. The proposed algorithm is applied to a double integrator and an inverted pendulum problem, with encouraging results.
  • Keywords: Markov decision processes; policy search; cross-entropy optimization; continuous actions
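
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no numerical details, so the dynamics, reward, horizon, Gaussian basis functions, the set of modifications (DELTAS), the fixed assignment of modifications to basis functions, and the Gaussian sampling distribution are all assumptions chosen only for illustration. The policy parameters here are just the centers of the basis functions, and the action carried between steps starts at zero.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
DT, HORIZON, GAMMA = 0.1, 50, 0.95
DELTAS = (-0.2, 0.0, 0.2)   # assumed discrete action modifications
N_BASIS = 6                 # assumed number of basis functions
INITIAL_STATES = [(-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0)]


def policy_delta(theta, state):
    """Return the modification tied to the basis function that is largest at `state`.

    theta holds the 2-D centers of N_BASIS Gaussian basis functions (assumed form).
    """
    def basis_value(i):
        cx, cv = theta[2 * i], theta[2 * i + 1]
        return math.exp(-((state[0] - cx) ** 2 + (state[1] - cv) ** 2))

    best = max(range(N_BASIS), key=basis_value)
    # Assumed fixed assignment: basis function i uses modification i mod 3.
    return DELTAS[best % len(DELTAS)]


def empirical_return(theta):
    """Average discounted return over the representative initial states."""
    total = 0.0
    for p, v in INITIAL_STATES:
        u, discount = 0.0, 1.0
        for _ in range(HORIZON):
            # Modify the previous continuous action by a small discrete step.
            u = max(-1.0, min(1.0, u + policy_delta(theta, (p, v))))
            # Double integrator with Euler discretization (assumed model).
            p, v = p + DT * v, v + DT * u
            total += discount * -(p * p + v * v)  # assumed quadratic reward
            discount *= GAMMA
    return total / len(INITIAL_STATES)


def cross_entropy_search(iterations=15, samples=40, elite_frac=0.25, seed=0):
    """Cross-entropy search over basis-function centers; no gradients are used."""
    rng = random.Random(seed)
    dim = 2 * N_BASIS
    mean, std = [0.0] * dim, [1.0] * dim
    n_elite = max(2, int(samples * elite_frac))
    best_theta, best_score = list(mean), empirical_return(mean)
    for _ in range(iterations):
        # Sample candidate parameter vectors from the current Gaussian.
        population = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                      for _ in range(samples)]
        scored = sorted(((empirical_return(th), th) for th in population),
                        key=lambda pair: pair[0], reverse=True)
        elites = [th for _, th in scored[:n_elite]]
        # Refit the sampling distribution to the elite samples.
        mean = [sum(col) / n_elite for col in zip(*elites)]
        std = [math.sqrt(sum((x - m) ** 2 for x in col) / n_elite) + 1e-3
               for col, m in zip(zip(*elites), mean)]
        if scored[0][0] > best_score:
            best_score, best_theta = scored[0]
    return best_theta, best_score


if __name__ == "__main__":
    theta, score = cross_entropy_search()
    print("best empirical return:", score)
```

Each iteration samples candidate parameter vectors, scores them by simulated return, and refits the sampling distribution to the best-scoring fraction. A fuller treatment would presumably also optimize which modification each basis function applies, which this sketch keeps fixed.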