Article Information

Title: Combining Local and Global Direct Derivative-free Optimization for Reinforcement Learning
Authors: M. Leonetti; P. Kormushev; S. Sagratella et al.
Journal: Cybernetics and Information Technologies
Print ISSN: 1311-9702
Online ISSN: 1314-4081
Year of publication: 2012
Volume: 12
Issue: 3
Publisher: Bulgarian Academy of Sciences
Abstract: We consider the problem of optimization in policy space for reinforcement learning. While a plethora of methods have been applied to this problem, only a narrow category of them proved feasible in robotics. We consider the peculiar characteristics of reinforcement learning in robotics, and devise a combination of two algorithms from the literature of derivative-free optimization. The proposed combination is well suited for robotics, as it involves both off-line learning in simulation and on-line learning in the real environment. We demonstrate our approach on a real-world task, where an Autonomous Underwater Vehicle has to survey a target area under potentially unknown environment conditions. We start from a given controller, which can perform the task under foreseeable conditions, and make it adaptive to the actual environment.
Keywords: Reinforcement learning; policy search; derivative-free optimization; robotics; autonomous underwater vehicles