文章基本信息

标题：Stochastic Enforced Hill-Climbing
本地全文：下载
作者：J. Wu ; R. Kalyanam ; R. Givan 等
期刊名称：Journal of Artificial Intelligence Research
印刷版ISSN：1076-9757
出版年度：2011
卷号：42
页码：815-850
出版社：American Association of Artificial
摘要：Enforced hill-climbing is an effective deterministic hill-climbing technique that deals with local optima using breadth-first search (a process called ``basin flooding''). We propose and evaluate a stochastic generalization of enforced hill-climbing for online use in goal-oriented probabilistic planning problems. We assume a provided heuristic function estimating expected cost to the goal with flaws such as local optima and plateaus that thwart straightforward greedy action choice. While breadth-first search is effective in exploring basins around local optima in deterministic problems, for stochastic problems we dynamically build and solve a heuristic-based Markov decision process (MDP) model of the basin in order to find a good escape policy exiting the local optimum. We note that building this model involves integrating the heuristic into the MDP problem because the local goal is to improve the heuristic. We evaluate our proposal in twenty-four recent probabilistic planning-competition benchmark domains and twelve probabilistically interesting problems from recent literature. For evaluation, we show that stochastic enforced hill-climbing (SEH) produces better policies than greedy heuristic following for value/cost functions derived in two very different ways: one type derived by using deterministic heuristics on a deterministic relaxation and a second type derived by automatic learning of Bellman-error features from domain-specific experience. Using the first type of heuristic, SEH is shown to generally outperform all planners from the first three international probabilistic planning competitions.