首页    期刊浏览 2024年08月31日 星期六
登录注册

文章基本信息

  • 标题:Model-Free Reinforcement Learning for Stochastic Parity Games
  • 本地全文:下载
  • 作者:Ernst Moritz Hahn ; Mateo Perez ; Sven Schewe
  • 期刊名称:LIPIcs : Leibniz International Proceedings in Informatics
  • 电子版ISSN:1868-8969
  • 出版年度:2020
  • 卷号:171
  • 页码:21:1-21:16
  • DOI:10.4230/LIPIcs.CONCUR.2020.21
  • 出版社:Schloss Dagstuhl -- Leibniz-Zentrum fuer Informatik
  • 摘要:This paper investigates the use of model-free reinforcement learning to compute the optimal value in two-player stochastic games with parity objectives. In this setting, two decision makers, player Min and player Max, compete on a finite game arena - a stochastic game graph with unknown but fixed probability distributions - to minimize and maximize, respectively, the probability of satisfying a parity objective. We give a reduction from stochastic parity games to a family of stochastic reachability games with a parameter ε, such that the value of a stochastic parity game equals the limit of the values of the corresponding simple stochastic games as the parameter ε tends to 0. Since this reduction does not require the knowledge of the probabilistic transition structure of the underlying game arena, model-free reinforcement learning algorithms, such as minimax Q-learning, can be used to approximate the value and mutual best-response strategies for both players in the underlying stochastic parity game. We also present a streamlined reduction from 1 1/2-player parity games to reachability games that avoids recourse to nondeterminism. Finally, we report on the experimental evaluations of both reductions.
  • 关键词:Reinforcement learning; Stochastic games; Omega-regular objectives
国家哲学社会科学文献中心版权所有