Article Information

  • Title: Exploiting Distributional Temporal Difference Learning to Deal with Tail Risk
  • Authors: Peter Bossaerts; Shijie Huang; Nitin Yadav
  • Journal: Risks
  • Print ISSN: 2227-9091
  • Year: 2020
  • Volume: 8
  • Issue: 4
  • Pages: 113
  • DOI: 10.3390/risks8040113
  • Language: English
  • Publisher: MDPI, Open Access Journal
  • Abstract: In traditional Reinforcement Learning (RL), agents learn to optimize actions in a dynamic context based on recursive estimation of expected values. We show that this form of machine learning fails when rewards (returns) are affected by tail risk, i.e., leptokurtosis. Here, we adapt a recent extension of RL, called distributional RL (disRL), and introduce estimation efficiency, while properly adjusting for differential impact of outliers on the two terms of the RL prediction error in the updating equations. We show that the resulting “efficient distributional RL” (e-disRL) learns much faster, and is robust once it settles on a policy. Our paper also provides a brief, nontechnical overview of machine learning, focusing on RL.
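
The abstract's central claim, that recursive estimation of expected values is fragile under tail risk, can be illustrated with a minimal sketch. This is not the authors' e-disRL algorithm: it is a single-state toy that compares a standard recursive update, V <- V + alpha * (r - V), with the sample median of the stored rewards, a robust estimator swapped in here purely for illustration. The function names, the step size `alpha`, and the choice of a Student-t distribution with 3 degrees of freedom as the leptokurtic reward model are all assumptions made for this example.

```python
import math
import random
import statistics


def student_t(rng, df):
    """Draw one Student-t variate as Z / sqrt(chi2_df / df).

    A chi-square variable with df degrees of freedom is Gamma(df/2, scale=2).
    Small df gives heavy tails (assumed stand-in for leptokurtic rewards).
    """
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    return z / math.sqrt(chi2 / df)


def td_value_estimate(rewards, alpha=0.1, v0=0.0):
    """Recursive single-state update: V <- V + alpha * (r - V)."""
    v = v0
    for r in rewards:
        v += alpha * (r - v)  # prediction error scaled by the step size
    return v


def median_value_estimate(rewards):
    """Robust location estimate from the stored reward sample (illustrative)."""
    return statistics.median(rewards)


if __name__ == "__main__":
    # One outlier shifts the recursive estimate by alpha * outlier,
    # while the median of the same sample does not move.
    sample = [0.0, 0.0, 0.0, 0.0, 1000.0]
    print("recursive estimate:", td_value_estimate(sample, alpha=0.1))
    print("median estimate:   ", median_value_estimate(sample))

    # Heavy-tailed rewards centred on 0 (seed and df are arbitrary choices).
    rng = random.Random(0)
    rewards = [student_t(rng, 3.0) for _ in range(5000)]
    print("recursive estimate:", td_value_estimate(rewards, alpha=0.1))
    print("median estimate:   ", median_value_estimate(rewards))
```

With a fixed step size, the recursive estimate weights the most recent prediction errors most heavily, so a single extreme reward can dominate it for many updates. An estimator built from the whole empirical reward distribution is less exposed to that effect, which is the intuition behind the distributional approach described in the abstract.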