Abstract: Nonlinear control problems have long been central subjects in control engineering, from both theoretical and applied perspectives. Reinforcement learning shows promising results for solving highly nonlinear control problems. Among the many variants of reinforcement learning, Deep Deterministic Policy Gradient (DDPG) handles continuous control signals, which makes it an ideal candidate for solving nonlinear control problems. Training, however, frequently requires a large number of computations. To improve the convergence of DDPG, we present a state-space segmentation method that divides the state space to expand the target region defined by the best reward. An inverted pendulum control example demonstrates the performance of the proposed segmentation method.
Keywords: reinforcement learning; learning convergence; reward; linear control
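To illustrate the segmentation idea in the abstract, the following minimal sketch assigns a shaped reward to each angular segment of an inverted pendulum's state space, so the high-reward target region near the upright position is surrounded by intermediate-reward segments. All segment boundaries and reward values here are illustrative assumptions, not the paper's actual design.

```python
import math

# Hypothetical segmentation of the pendulum angle space (assumed values,
# not taken from the paper): each segment of |theta| gets a shaped reward,
# expanding the region from which the best-reward target is reachable.
SEGMENTS = [
    (0.0, 0.1, 1.0),     # near upright: best reward (target region)
    (0.1, 0.5, 0.5),     # close to upright: intermediate reward
    (0.5, 1.5, 0.1),     # far from upright: small reward
    (1.5, math.pi, 0.0)  # hanging down: no reward
]

def segmented_reward(theta: float) -> float:
    """Return the reward of the segment that |theta| (radians) falls into."""
    a = abs(theta)
    for lo, hi, reward in SEGMENTS:
        if lo <= a < hi:
            return reward
    return 0.0

print(segmented_reward(0.05))  # inside the target segment -> 1.0
print(segmented_reward(2.0))   # hanging segment -> 0.0
```

A DDPG agent trained against such a segmented reward receives a learning signal well before it first reaches the narrow upright region, which is one plausible mechanism for the faster convergence the abstract claims.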