Article Information

Title: Robust Reinforcement Learning for Stochastic Linear Quadratic Control with Multiplicative Noise
Authors: Bo Pang; Zhong-Ping Jiang
Journal: IFAC PapersOnLine
Print ISSN: 2405-8963
Publication year: 2021
Volume: 54
Issue: 7
Pages: 240-243
DOI: 10.1016/j.ifacol.2021.08.365
Language: English
Publisher: Elsevier
Abstract: This paper studies the robustness of reinforcement learning for discrete-time linear stochastic systems with multiplicative noise evolving in continuous state and action spaces. As one of the popular methods in reinforcement learning, the robustness of policy iteration is a longstanding open issue for the stochastic linear quadratic regulator (LQR) problem with multiplicative noise. A solution in the spirit of small-disturbance input-to-state stability is given, guaranteeing that the solutions of the policy iteration algorithm are bounded and enter a small neighborhood of the optimal solution, whenever the error in each iteration is bounded and small. In addition, a novel off-policy multiple-trajectory optimistic least-squares policy iteration algorithm is proposed, to learn a near-optimal solution of the stochastic LQR problem directly from online input/state data, without explicitly identifying the system matrices. The efficacy of the proposed algorithm is supported by rigorous convergence analysis and numerical results on a second-order example.
Keywords: Reinforcement learning control; Stochastic optimal control problems; Data-based control; Robustness analysis; Input-to-state stability