Basic article information

Title: A Neural Dynamic Programming Approach For Learning Control Of Failure Avoidance Problems
Authors: Derong LIU; Huaguang ZHANG
Journal: International Journal of Intelligent Control and Systems
Print ISSN: 0218-7965
Year of publication: 2005
Volume: 10
Issue: 1
Pages: 21-32
Publisher: Westing Publishing Co., Fremont
Abstract: In the present paper, we consider the implementation of adaptive critic designs using neural networks. We study a class of adaptive critic designs that can be classified as (model-free) action-dependent heuristic dynamic programming (ADHDP). The present ADHDP is equivalent to the conventional model-based heuristic dynamic programming (HDP) if we view the model network in the latter as completely embedded in the critic network. This is a valid viewpoint since a neural network connected to another simply forms a larger neural network. We present three approaches for training the neural networks in our ADHDP. In particular, for the critic network training, these include non-batch and batch learning with calculated target output values, as well as batch learning with an analytically derived overall cost function as the target for learning. The application considered in the present paper is the learning control of failure avoidance problems, which we characterize by choosing the local cost function to be zero throughout a trial except at the last time step, when a failure occurs. For failure avoidance problems defined this way, we derive an analytical form of the overall cost function, which is defined as the infinite summation of the local cost function over time. We use the benchmark problem of balancing a pole on a cart to demonstrate that the critic network learning achieved in both non-batch and batch learning with calculated target output values closely resembles the learning achieved with the analytically derived overall cost function.
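
The abstract's idea of an analytically derived overall cost can be illustrated with a minimal sketch. The abstract gives no formula, so everything specific here is an assumption for illustration: a discount factor `gamma`, a failure cost of 1, and a sum that starts at the current time step. Under those assumptions, a local cost that is zero everywhere except at the failure step yields a cost-to-go of `gamma ** (failure_step - t)`. The function names are hypothetical, and the paper's own derivation and indexing may differ.

```python
def discounted_cost_to_go(local_costs, gamma):
    """Return J[t] = sum over k >= t of gamma**(k - t) * U[k] for one trial.

    Assumption: discounted sum starting at the current step t.
    """
    costs = [0.0] * len(local_costs)
    running = 0.0
    # Accumulate backward so each step reuses the cost-to-go of the next step.
    for t in range(len(local_costs) - 1, -1, -1):
        running = local_costs[t] + gamma * running
        costs[t] = running
    return costs


def closed_form_cost(t, failure_step, gamma, failure_cost=1.0):
    """Closed form when the only nonzero local cost occurs at failure_step."""
    return failure_cost * gamma ** (failure_step - t)


# Hypothetical trial: six steps, failure at the last one (assumed cost 1).
gamma = 0.95
trial_length = 6
local_costs = [0.0] * (trial_length - 1) + [1.0]

numeric = discounted_cost_to_go(local_costs, gamma)
analytic = [closed_form_cost(t, trial_length - 1, gamma)
            for t in range(trial_length)]
```

In this sketch the summed values and the closed-form values agree up to floating-point rounding, which is the kind of target a critic network could be trained against when the analytical form is available.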