Journal: International Journal of Grid and Distributed Computing
Print ISSN: 2005-4262
Year of publication: 2016
Volume: 9
Issue: 3
Pages: 243-250
DOI: 10.14257/ijgdc.2016.9.3.25
Publisher: SERSC
Abstract: This paper studies the ε-greedy and softmax algorithms in obstacle avoidance and balance learning. In the experiments, a suitably simplified model of obstacle avoidance was built for the Sarsa and Q-Learning algorithms; the softmax algorithm was used to balance exploration and exploitation; and these two classical reinforcement learning algorithms were applied to the obstacle-avoidance task. The simulation results show that Sarsa and Q-Learning can handle obstacle avoidance and balance learning within a limited number of time steps, allowing the intelligent agent to improve the non-maximal estimates of the state value function while still choosing the best action it has tried so far. In addition, both algorithms enable the agent to try new actions and find the optimal one.
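The four techniques named in the abstract can be illustrated with a minimal Python sketch. This is a textbook-style formulation, not the paper's implementation: the function names, the list-of-lists Q-table, and the parameters `epsilon`, `temperature`, `alpha`, and `gamma` are all assumptions introduced here for illustration.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon explore at random, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax_probs(q_values, temperature):
    """Boltzmann distribution over actions (max subtracted for stability)."""
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def softmax_action(q_values, temperature, rng=random):
    """Sample an action; higher-valued actions are proportionally likelier."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy update: bootstrap from the action actually taken in s_next."""
    Q[s][a] += alpha * (r + gamma * Q[s_next][a_next] - Q[s][a])


def q_learning_update(Q, s, a, r, s_next, alpha, gamma):
    """Off-policy update: bootstrap from the greedy action in s_next."""
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
```

The only difference between the two update rules is the bootstrap target: Sarsa uses the value of the next action the policy actually selects, while Q-Learning uses the maximum value over next actions. Softmax selection, unlike ε-greedy, still assigns graded probabilities to non-greedy actions, which is one way an agent can keep refining non-maximal value estimates.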