Article Information

  • Title: Kalman Temporal Differences
  • Authors: M. Geist; O. Pietquin
  • Journal: Journal of Artificial Intelligence Research
  • Print ISSN: 1076-9757
  • Year of publication: 2010
  • Volume: 39
  • Pages: 483-532
  • Publisher: American Association of Artificial
  • Abstract: Because reinforcement learning suffers from a lack of scalability, online value (and Q-) function approximation has received increasing interest over the last decade. This contribution introduces a novel approximation scheme, namely the Kalman Temporal Differences (KTD) framework, that exhibits the following features: sample-efficiency, non-linear approximation, non-stationarity handling and uncertainty management. A first KTD-based algorithm is provided for deterministic Markov Decision Processes (MDP) which produces biased estimates in the case of stochastic transitions. Then the eXtended KTD framework (XKTD), solving stochastic MDP, is described. Convergence is analyzed for special cases for both deterministic and stochastic transitions. Related algorithms are experimented on classical benchmarks. They compare favorably to the state of the art while exhibiting the announced features.