
Article information

  • Title: Computing monotone policies for Markov decision processes: a nearly-isotonic penalty approach*
    * This work was partially supported by the Swedish Research Council under contract 2016-06079 and the Linnaeus Center ACCESS at KTH.
  • Authors: Robert Mattila; Cristian R. Rojas; Vikram Krishnamurthy
  • Journal: IFAC PapersOnLine
  • Print ISSN: 2405-8963
  • Year of publication: 2017
  • Volume: 50
  • Issue: 1
  • Pages: 8429-8434
  • DOI: 10.1016/j.ifacol.2017.08.1575
  • Language: English
  • Publisher: Elsevier
  • Abstract: This paper discusses algorithms for solving Markov decision processes (MDPs) that have monotone optimal policies. We propose a two-stage alternating convex optimization scheme that can accelerate the search for an optimal policy by exploiting the monotone property. The first stage is a linear program formulated in terms of the joint state-action probabilities. The second stage is a regularized problem formulated in terms of the conditional probabilities of actions given states. The regularization uses techniques from nearly-isotonic regression. While a variety of iterative methods can be used in the first formulation of the problem, we show in numerical simulations that, in particular, the alternating direction method of multipliers (ADMM) can be significantly accelerated using the regularization step.
  • Keywords: stochastic control; Markov decision process (MDP); l1-regularization; sparsity; monotone policy; alternating direction method of multipliers (ADMM); isotonic regression
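
The abstract links two parameterizations: joint state-action probabilities (first stage) and conditional action probabilities given the state (second stage), with a nearly-isotonic penalty encouraging monotonicity. The abstract does not give the paper's exact regularizer, so the following is only a minimal sketch under stated assumptions: a two-action MDP, the usual nearly-isotonic penalty form (the sum of the positive parts of successive decreases), and hypothetical helper names `conditional_policy` and `nearly_isotonic_penalty` that do not come from the paper.

```python
def conditional_policy(joint):
    """Convert joint probabilities joint[x][a] into conditionals P(a | x).

    Assumption: states with zero total mass get a uniform distribution,
    which is an arbitrary illustrative choice.
    """
    policy = []
    for row in joint:
        total = sum(row)
        if total > 0:
            policy.append([p / total for p in row])
        else:
            policy.append([1.0 / len(row)] * len(row))
    return policy


def nearly_isotonic_penalty(values):
    """Sum of positive parts of (values[i] - values[i+1]).

    The result is zero exactly when `values` is non-decreasing, so it
    measures how far a sequence is from being monotone.
    """
    return sum(max(values[i] - values[i + 1], 0.0)
               for i in range(len(values) - 1))


# Hypothetical joint state-action probabilities for 4 states and 2 actions.
monotone_joint = [[0.3, 0.0], [0.2, 0.0], [0.0, 0.25], [0.0, 0.25]]
zigzag_joint = [[0.0, 0.3], [0.2, 0.0], [0.0, 0.25], [0.25, 0.0]]

# Probability of choosing action 1 in each state, ordered by state index.
monotone_p1 = [row[1] for row in conditional_policy(monotone_joint)]
zigzag_p1 = [row[1] for row in conditional_policy(zigzag_joint)]

monotone_penalty = nearly_isotonic_penalty(monotone_p1)
zigzag_penalty = nearly_isotonic_penalty(zigzag_p1)
```

In this toy setup a threshold policy (action 0 in low states, action 1 in high states) incurs no penalty, whereas a policy that switches back and forth is penalized once for each decrease.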