Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Publication year: 2016
Volume: 7
Issue: 3
Pages: 1337-1342
Publisher: TechScience Publications
Abstract: In some applications, the output of the system is a sequence of actions. There is no such thing as the best action in any intermediate state; an action is good if it is part of a good policy. A single action is not important; what matters is the policy, that is, the sequence of correct actions that reaches the goal. To be able to generate a policy, a machine learning program should be able to assess the quality of policies and learn from past good action sequences. Learning is the basic capacity of intelligent agents. Through learning, an agent changes its behavior based on its previous experiences. An intelligent agent must be formalized by knowledge and be able to act on this knowledge. Reinforcement learning techniques have been applied successfully in many single-agent systems to learn an agent's policy in uncertain environments. Many existing single-agent models for sequential decision making are derived from a general model and are distinguished by their assumptions. Q-learning algorithms are used for this purpose. A single-agent learning model is given in this paper. Four single-agent reinforcement learning algorithms are implemented and their results are compared. The single-agent Q-learning and Sarsa learning algorithms give some results for the problem. However, adding eligibility traces to the single-agent learning algorithms, i.e. Q(λ) learning and Sarsa(λ) learning, performs better than the previous algorithms. The paper shows the results of all four algorithms and performance comparisons among them.
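
For orientation, the update rules behind the algorithms named in the abstract can be sketched as follows. This is a minimal illustration of the common textbook forms of tabular Q-learning, Sarsa, and Sarsa(λ) with accumulating traces, not the paper's own implementation; the function names, the dictionary-based tables, and the values of `alpha`, `gamma`, and `lam` are all assumptions made for the example. Q(λ) is omitted for brevity.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy step: bootstrap from the greedy action in s_next."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy step: bootstrap from the action actually chosen in s_next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


def sarsa_lambda_update(Q, E, s, a, r, s_next, a_next,
                        alpha=0.1, gamma=0.9, lam=0.8):
    """Sarsa(lambda) step with accumulating eligibility traces.

    The TD error is applied to every recently visited state-action pair,
    weighted by its trace, and all traces then decay by gamma * lam.
    """
    delta = r + gamma * Q[(s_next, a_next)] - Q[(s, a)]
    E[(s, a)] += 1.0
    for key in list(E):
        Q[key] += alpha * delta * E[key]
        E[key] *= gamma * lam


# Hypothetical two-step episode: s0 -> s1 (reward 0), s1 -> s2 (reward 1).
Q = defaultdict(float)  # action-value table, defaults to 0.0
E = defaultdict(float)  # eligibility traces, defaults to 0.0
sarsa_lambda_update(Q, E, "s0", "right", 0.0, "s1", "right")
sarsa_lambda_update(Q, E, "s1", "right", 1.0, "s2", "right")
```

In this toy episode the reward arrives only on the second step, yet the trace left on `("s0", "right")` lets that earlier pair receive part of the update immediately. One-step Q-learning or Sarsa would need another pass through the episode to propagate the reward back that far, which is one common explanation for why trace-based variants can learn faster.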