Abstract: We propose an algorithm to search for parametrized policies in continuous state and action Markov Decision Processes (MDPs). The policies are represented via a number of basis functions, and the main novelty is that each basis function corresponds to a small, discrete modification of the continuous action. In each state, the policy chooses the discrete action modification associated with the basis function having the maximum value at the current state. Empirical returns from a representative set of initial states are estimated in simulations to evaluate the policies. Instead of using slow gradient-based algorithms, we apply the cross-entropy method to update the parameters. The proposed algorithm is applied to a double-integrator and an inverted-pendulum problem, with encouraging results.
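To make the abstract's description concrete, the following is a minimal sketch of the scheme it outlines: a policy that, in each state, applies the discrete action increment paired with the maximally activated basis function, evaluated by empirical returns from a fixed set of initial states and tuned with the cross-entropy method. The Gaussian RBF form of the basis functions, the toy double-integrator dynamics, the reward, and all sizes and constants (N_BASIS, POP, ELITE, the increment grid, the initial states) are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Toy double-integrator environment (illustrative, not from the paper) ---
DT, HORIZON, U_MAX = 0.1, 50, 1.0

def rollout(weights, s0):
    """Empirical return of the policy from initial state s0."""
    state, u, ret = np.array(s0, dtype=float), 0.0, 0.0
    for _ in range(HORIZON):
        u = np.clip(policy_action(state, u, weights), -U_MAX, U_MAX)
        pos, vel = state
        state = np.array([pos + DT * vel, vel + DT * u])   # double-integrator step
        ret += -(pos ** 2 + 0.1 * vel ** 2)                # assumed quadratic reward
    return ret

# --- Policy: basis functions paired with discrete action modifications ---
N_BASIS = 16
centers = rng.uniform(-1.0, 1.0, size=(N_BASIS, 2))        # assumed RBF centers
deltas = np.linspace(-0.2, 0.2, N_BASIS)                   # discrete increments of u

def policy_action(state, prev_u, weights):
    """Apply the increment of the basis function with the maximum value."""
    act = weights * np.exp(-np.sum((centers - state) ** 2, axis=1))
    return prev_u + deltas[np.argmax(act)]

# --- Cross-entropy method over the basis-function weights ---
INIT_STATES = [(-1.0, 0.0), (1.0, 0.0), (0.5, -0.5), (-0.5, 0.5)]  # representative set

def score(weights):
    return np.mean([rollout(weights, s0) for s0 in INIT_STATES])

mu, sigma = np.zeros(N_BASIS), np.ones(N_BASIS)
POP, ELITE = 50, 10
for it in range(30):
    pop = mu + sigma * rng.standard_normal((POP, N_BASIS))  # sample candidate policies
    scores = np.array([score(w) for w in pop])
    elite = pop[np.argsort(scores)[-ELITE:]]                # keep best-performing samples
    mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-3  # refit sampling distribution
    print(f"iter {it:2d}  mean elite return {scores[np.argsort(scores)[-ELITE:]].mean():.3f}")
```

Because the policy only selects among a finite set of modifications, each cross-entropy iteration needs nothing more than rollouts and an argmax per step, which is what lets the method sidestep gradient computation entirely.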