
Article Information

  • Title: The Cross-Entropy Method for Policy Search in Decentralized POMDPs
  • Authors: Frans A. Oliehoek; Julian F.P. Kooij; Nikos Vlassis
  • Journal: Informatica
  • Print ISSN: 1514-8327
  • Electronic ISSN: 1854-3871
  • Publication year: 2008
  • Volume: 32
  • Issue: 4
  • Publisher: The Slovene Society Informatika, Ljubljana
  • Abstract: Decentralized POMDPs (Dec-POMDPs) are becoming increasingly popular as models for multiagent planning under uncertainty, but solving a Dec-POMDP exactly is known to be an intractable combinatorial optimization problem. In this paper we apply the Cross-Entropy (CE) method, a recently introduced method for combinatorial optimization, to Dec-POMDPs, resulting in a randomized (sampling-based) algorithm for approximately solving Dec-POMDPs. This algorithm operates by sampling pure policies from an appropriately parametrized stochastic policy, and then evaluates these policies either exactly or approximately in order to define the next stochastic policy to sample from, and so on until convergence. Experimental results demonstrate that the CE method can search huge spaces efficiently, supporting our claim that combinatorial optimization methods can bring leverage to the approximate solution of Dec-POMDPs.
  • Keywords: multiagent planning; decentralized POMDPs; combinatorial optimization
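
The sample, evaluate, update loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' Dec-POMDP implementation: the function name `cross_entropy_search`, the toy objective `toy_value`, the `TARGET` tuple, and every parameter value below are assumptions chosen for illustration. A "policy" here is simply one discrete action per slot, and the stochastic policy is one independent action distribution per slot.

```python
import random


def cross_entropy_search(evaluate, n_slots, n_actions, n_samples=50,
                         n_elite=10, alpha=0.7, n_iters=30, seed=0):
    """Generic Cross-Entropy search over tuples of discrete actions (illustrative)."""
    rng = random.Random(seed)
    # Start from a uniform stochastic policy: one action distribution per slot.
    probs = [[1.0 / n_actions] * n_actions for _ in range(n_slots)]
    best_policy, best_value = None, float("-inf")

    for _ in range(n_iters):
        # 1. Sample pure policies: draw one action per slot from the current distribution.
        samples = [
            tuple(rng.choices(range(n_actions), weights=probs[s])[0]
                  for s in range(n_slots))
            for _ in range(n_samples)
        ]
        # 2. Evaluate every sample and keep the highest-valued ones as the elite set.
        scored = sorted(((evaluate(p), p) for p in samples), reverse=True)
        elite = [p for _, p in scored[:n_elite]]
        if scored[0][0] > best_value:
            best_value, best_policy = scored[0]
        # 3. Move each slot's distribution toward the elite action frequencies.
        #    alpha < 1 smooths the update so no probability collapses to zero at once.
        for s in range(n_slots):
            counts = [0] * n_actions
            for p in elite:
                counts[p[s]] += 1
            for a in range(n_actions):
                freq = counts[a] / len(elite)
                probs[s][a] = alpha * freq + (1 - alpha) * probs[s][a]

    return best_policy, best_value, probs


# Hypothetical toy objective: count the slots that agree with a hidden target.
TARGET = (2, 0, 1, 1, 0, 2, 2, 1)


def toy_value(policy):
    return sum(a == t for a, t in zip(policy, TARGET))


policy, value, probs = cross_entropy_search(toy_value, n_slots=len(TARGET), n_actions=3)
print(policy, value)
```

In a Dec-POMDP setting the slots would presumably correspond to each agent's observation histories and `evaluate` to an exact or approximate policy evaluation, as the abstract indicates; the toy objective above only stands in for that step.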