Article information

  • Title: Fine-tuning Deep RL with Gradient-Free Optimization
  • Authors: Tim de Bruin ; Jens Kober ; Karl Tuyls
  • Journal: IFAC PapersOnLine
  • Print ISSN: 2405-8963
  • Year: 2020
  • Volume: 53
  • Issue: 2
  • Pages: 8049-8056
  • DOI: 10.1016/j.ifacol.2020.12.2240
  • Language: English
  • Publisher: Elsevier
  • Abstract: Deep reinforcement learning makes it possible to train control policies that map high-dimensional observations to actions. These methods typically use gradient-based optimization techniques to enable relatively efficient learning, but are notoriously sensitive to hyperparameter choices and do not have good convergence properties. Gradient-free optimization methods, such as evolutionary strategies, can offer a more stable alternative but tend to be much less sample efficient. In this work we propose a combination, using the relative strengths of both. We start with a gradient-based initial training phase, which is used to quickly learn both a state representation and an initial policy. This phase is followed by a gradient-free optimization of only the final action selection parameters. This enables the policy to improve in a stable manner to a performance level not obtained by gradient-based optimization alone, using many fewer trials than methods using only gradient-free optimization. We demonstrate the effectiveness of the method on two Atari games, a continuous control benchmark and the CarRacing-v0 benchmark. On the latter we surpass the best previously reported score while using significantly fewer episodes.
  • Keywords: Reinforcement Learning ; Deep Learning ; Optimization ; Neural Networks ; Control
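
The two-phase idea in the abstract (keep the gradient-trained representation fixed, then search only over the final action-selection parameters without gradients) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the names `features`, `episode_return` and `finetune_head` are hypothetical, the frozen random matrix stands in for layers that would normally come from the gradient-based phase, the score is a simple regression error instead of an environment return, and the optimizer is a basic elitist (1+λ) evolution strategy chosen for brevity, not necessarily the one used in the paper.

```python
import numpy as np

def features(obs, frozen_w):
    """Frozen feature extractor (stand-in for the gradient-trained layers)."""
    return np.tanh(obs @ frozen_w)

def episode_return(head, frozen_w, obs_batch, target_actions):
    """Toy score: negative mean squared error of the actions (higher is better)."""
    actions = features(obs_batch, frozen_w) @ head
    return -float(np.mean((actions - target_actions) ** 2))

def finetune_head(head, score_fn, iterations=200, population=16, sigma=0.1, seed=0):
    """Elitist (1+lambda) evolution strategy over the final-layer parameters only."""
    rng = np.random.default_rng(seed)
    best, best_score = head.copy(), score_fn(head)
    for _ in range(iterations):
        # Sample a population of perturbed copies of the current best head.
        noise = rng.standard_normal((population,) + best.shape)
        candidates = best + sigma * noise
        scores = [score_fn(c) for c in candidates]
        i = int(np.argmax(scores))
        # Accept a candidate only if it strictly improves on the current best.
        if scores[i] > best_score:
            best, best_score = candidates[i], scores[i]
    return best, best_score

# Hypothetical setup: 8-dim observations, 16 frozen features, 2-dim actions.
rng = np.random.default_rng(1)
frozen_w = rng.standard_normal((8, 16))
obs = rng.standard_normal((64, 8))
true_head = rng.standard_normal((16, 2))
targets = features(obs, frozen_w) @ true_head

# Pretend the gradient-based phase left us with an imperfect final layer.
initial_head = true_head + 0.5 * rng.standard_normal((16, 2))

def score(head):
    return episode_return(head, frozen_w, obs, targets)

initial_score = score(initial_head)
tuned_head, tuned_score = finetune_head(initial_head, score)
```

Because a candidate is accepted only when it scores strictly higher, `tuned_score` can never fall below `initial_score` on this deterministic toy objective. With a noisy environment return, each candidate would typically need to be evaluated over several episodes before being accepted.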