Basic article information

  • Title: An Extention of Profit Sharing to Partially Observable Markov Decision Processes : Proposition of PS-r* and its Evaluation
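  • Authors: Kazuteru Miyazaki ; Shigenobu Kobayashi
  • Journal: 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)
  • Print ISSN: 1346-0714
  • Electronic ISSN: 1346-8030
  • Year of publication: 2003
  • Volume: 18
  • Issue: 5
  • Pages: 286-296
  • DOI: 10.1527/tjsai.18.286
  • Publisher: The Japanese Society for Artificial Intelligence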
  • Abstract: The rationality theorem of Profit Sharing (PS) [Miyazaki 94, Miyazaki 99b] and the Rational Policy Making algorithm (RPM) [Miyazaki 99a] are known to guarantee rationality in a typical class of Partially Observable Markov Decision Processes (POMDPs). In this paper, we focus on the whole class of POMDPs and propose PS-r, an algorithm that combines PS and RPM with random selection. First, we analyze the behavior of PS-r. We derive that the maximum number of steps needed to obtain a reward with PS-r, divided by that of random selection, is $r\frac{\left(1+\frac{M-1}{r}\right)^n}{M^n}$, where $n$ is the maximum number of states that are sensed as the same state due to the agent's sensory limitation and $M$ is the number of actions. Furthermore, we propose PS-r*, which improves the behavior of PS-r. Through numerical examples, we confirm the effectiveness of PS-r*.
  • Keywords: reinforcement learning ; profit sharing ; rational policy making algorithm ; POMDPs ; theorem
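
The ratio stated in the abstract can be evaluated numerically to get a feel for how it scales. The following is a minimal sketch, not code from the paper: the function name `ps_r_step_ratio` is hypothetical, and `r` is treated simply as the positive PS-r parameter that appears in the formula, since the abstract does not define it.

```python
def ps_r_step_ratio(r: float, M: int, n: int) -> float:
    """Evaluate r * (1 + (M - 1) / r) ** n / M ** n, the ratio given in the abstract.

    r: parameter of PS-r as it appears in the formula (assumed positive;
       its meaning is not defined in the abstract)
    M: number of actions
    n: maximum number of states sensed as the same state
    """
    if r <= 0 or M < 1 or n < 0:
        raise ValueError("expected r > 0, M >= 1 and n >= 0")
    return r * (1 + (M - 1) / r) ** n / M ** n


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    for r in (1, 2, 10):
        print(r, ps_r_step_ratio(r, M=4, n=3))
```

For r = 1 the expression reduces to $M^n / M^n = 1$, which gives a quick sanity check on the implementation.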