Article Information

  • Title: Policy Derivation Methods for Critic-Only Reinforcement Learning in Continuous Action Spaces
  • Authors: Eduard Alibekov; Jiri Kubalik; Robert Babuska
  • Journal: IFAC PapersOnLine
  • Print ISSN: 2405-8963
  • Year of publication: 2016
  • Volume: 49
  • Issue: 5
  • Pages: 285-290
  • DOI: 10.1016/j.ifacol.2016.07.127
  • Language: English
  • Publisher: Elsevier
  • Abstract: State-of-the-art critic-only reinforcement learning methods can deal with a small discrete action space. The most common approach to real-world problems with continuous actions is to discretize the action space. In this paper, a method is proposed to derive a continuous-action policy based on a value function that has been computed for discrete actions by using any known algorithm, such as value iteration. Several variants of the policy-derivation algorithm are introduced and compared on two continuous state-action benchmarks: double pendulum swing-up and 3D mountain car.
  • Keywords: reinforcement learning; continuous actions; multi-variable systems; optimal control; policy derivation
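
The abstract describes deriving a continuous action from values computed only at discrete actions. As a rough illustration of that general idea, and not one of the paper's own variants, the sketch below fits a parabola through the best discrete action and its two neighbours and returns the parabola's maximizer. The function name `refine_action` and the one-dimensional action grid are assumptions made for this example.

```python
def refine_action(actions, q_values):
    """Return a continuous action near the best discrete one.

    Illustrative assumption: `actions` is a sorted 1-D grid of distinct
    values and `q_values[k]` is the value of taking `actions[k]` in
    some fixed state.
    """
    # Index of the best discrete action (plain arg-max over the grid).
    i = max(range(len(actions)), key=lambda k: q_values[k])

    # At a grid boundary there is no neighbour on one side,
    # so fall back to the discrete action.
    if i == 0 or i == len(actions) - 1:
        return actions[i]

    u0, u1, u2 = actions[i - 1], actions[i], actions[i + 1]
    q0, q1, q2 = q_values[i - 1], q_values[i], q_values[i + 1]

    # Coefficients a, b of the parabola a*u^2 + b*u + c through the three points.
    denom = (u0 - u1) * (u0 - u2) * (u1 - u2)
    a = (u2 * (q1 - q0) + u1 * (q0 - q2) + u0 * (q2 - q1)) / denom
    b = (u2**2 * (q0 - q1) + u1**2 * (q2 - q0) + u0**2 * (q1 - q2)) / denom

    # A non-concave fit has no interior maximum; keep the discrete action.
    if a >= 0:
        return u1

    # Vertex of the parabola, clipped to the neighbouring grid points.
    u_star = -b / (2 * a)
    return min(max(u_star, u0), u2)
```

For values sampled from a function that is exactly quadratic in the action, the refined action coincides with the true maximizer even when it lies between grid points; for other value functions it is only a local approximation.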