Journal: International Journal of Advanced Robotic Systems
Print ISSN: 1729-8806
Online ISSN: 1729-8814
Year: 2020
Volume: 17
Issue: 2
Pages: 1-9
DOI: 10.1177/1729881420916258
Publisher: SAGE Publications
Abstract: Reinforcement learning has been a promising approach in control and robotics, since data-driven learning removes the need for hand-crafted engineering knowledge. However, it usually requires many interactions with the environment to train a controller. This is a practical limitation in some real settings, for example on robots, where interactions with the environment are restricted and time-consuming. Learning is therefore generally conducted in a simulation environment, and the learned policy is then migrated to the real environment. However, differences between the simulation and the real environment, for example joint friction coefficients or changing loads, may cause undesired results during migration. To address this problem, most learning approaches concentrate on retraining, system or parameter identification, or adaptive policy training. In this article, we propose an approach in which an adaptive policy is learned by extracting more information from the data. An environmental encoder, which indirectly reflects the parameters of an environment, is trained by explicitly incorporating model uncertainties into long-term planning and policy learning. This approach can identify environment differences when migrating the learned policy to a real environment, thereby increasing the adaptability of the policy. Moreover, its applicability to autonomous learning in control tasks is also verified.
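The core idea in the abstract, a policy conditioned on a latent environment code inferred from observed transitions, can be illustrated with a minimal sketch. This is a hypothetical toy example, not the authors' implementation: the "encoder" here is a simple gain estimate over linear dynamics `s_next = g * (s + a)`, standing in for hidden parameters such as friction or load.

```python
def encode_environment(transitions):
    """Infer a scalar environment code from observed (state, action, next_state)
    transitions, assuming toy dynamics s_next = g * (s + a).
    The estimated gain g stands in for hidden environment parameters."""
    gains = [s_next / (s + a) for (s, a, s_next) in transitions if (s + a) != 0]
    return sum(gains) / len(gains) if gains else 1.0

def adaptive_policy(state, env_code, target):
    """Choose the action that drives the next state toward `target`,
    given the inferred environment code (the estimated gain)."""
    return target / env_code - state

def simulate(env_gain, env_code, s0, target, steps=3):
    """Roll out the adaptive policy in an environment with true gain
    `env_gain` while the policy believes the code is `env_code`."""
    s = s0
    for _ in range(steps):
        a = adaptive_policy(s, env_code, target)
        s = env_gain * (s + a)
    return s
```

With a correct code the policy reaches the target in one step regardless of the environment's true gain; with a stale code from a different (simulated) environment it settles at the wrong state, which is the migration gap the article's encoder is meant to close.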