Basic article information

Title: Compound Reinforcement Learning
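Author: Tohgoroh Matsui
Journal: 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)
Print ISSN: 1346-0714
Electronic ISSN: 1346-8030
Publication year: 2011
Volume: 26
Issue: 2
Pages: 330-334
DOI: 10.1527/tjsai.26.330
Publisher: The Japanese Society for Artificial Intelligence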
Abstract: This paper describes a reinforcement learning framework based on compound returns, which is called compound reinforcement learning. Compound reinforcement learning maximizes the compound return in returns-based MDPs. We also describe the compound Q-learning algorithm. We present experimental results using an illustrative example, a 2-armed bandit.
Keywords: reinforcement learning; value functions; compound returns; Q-learning