
Article Information

  • Title: Improvements of the Penalty Avoiding Rational Policy Making Algorithm and an Application to the Othello Game
  • Authors: Kazuteru Miyazaki ; Sougo Tsuboi ; Shigenobu Kobayashi
  • Journal: 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)
  • Print ISSN: 1346-0714
  • Online ISSN: 1346-8030
  • Year: 2002
  • Volume: 17
  • Issue: 5
  • Pages: 548-556
  • DOI: 10.1527/tjsai.17.548
  • Publisher: The Japanese Society for Artificial Intelligence
  • Abstract: The purpose of reinforcement learning is, in general, to learn an optimal policy. However, in two-player games such as Othello, it is important to acquire a penalty-avoiding policy. In this paper, we focus on forming a penalty-avoiding policy based on the Penalty Avoiding Rational Policy Making algorithm [Miyazaki 01]. In applying it to large-scale problems, we are confronted with the curse of dimensionality. We introduce several ideas and heuristics to overcome the combinatorial explosion in large-scale problems. First, we propose an algorithm that saves memory by calculating state transitions. Second, we describe how to restrict exploration using two types of knowledge: a KIFU database and an evaluation function. We show that our learning player can always defeat the well-known Othello program KITTY.
  • Keywords: reinforcement learning ; reward and penalty ; penalty avoiding rational policy making ; the othello game ; KITTY