Abstract: A reward function estimated with inverse reinforcement learning has been used to determine a method for controlling a robot. However, the number of state transitions that can be observed from the action sequences given to inverse reinforcement learning decreases drastically as the complexity of the planning problem increases. Although a reward function can still be designed from such partial state-transition information, the obtained reward function contains ambiguity. When learning with a reward function that includes ambiguity, the learning must be able to tolerate that ambiguity. In this paper, we propose a method that uses fuzzy reasoning to quantify the ambiguity of a reward function designed with inverse reinforcement learning. Experimental results suggest that the proposed method can learn action sequences while considering the degree of risk and safety.
Keywords: inverse reinforcement learning; fuzzy reasoning; reward function
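The abstract names fuzzy reasoning as the tool for quantifying the ambiguity of a reward function estimated with inverse reinforcement learning, but gives no implementation details. The sketch below is therefore only an illustration of the general idea, not the authors' method: it assumes a small set of hypothetical candidate reward vectors that are all consistent with the same partial demonstrations, treats their per-state spread as the ambiguity signal, and maps that spread through two triangular fuzzy sets ("low ambiguity", "high ambiguity") to produce a pessimistically adjusted reward for risky states. All names, membership ranges, and numeric values are assumptions.

    import numpy as np

    # Hypothetical candidate reward vectors over four states, all consistent with
    # the same partial demonstrations (the source of the ambiguity).
    candidate_rewards = np.array([
        [1.0, 0.2, -0.5, 0.8],
        [0.9, 0.4, -0.1, 0.7],
        [1.1, -0.3, -0.6, 0.9],
    ])  # shape: (num_candidates, num_states)

    def triangular(x, a, b, c):
        """Triangular fuzzy membership function with peak at b and support [a, c]."""
        return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

    # Per-state spread of the candidate rewards, used as the raw ambiguity signal.
    spread = candidate_rewards.max(axis=0) - candidate_rewards.min(axis=0)

    # Fuzzy sets over the spread (illustrative ranges chosen for this toy example).
    low_ambiguity = triangular(spread, -0.01, 0.0, 0.5)
    high_ambiguity = triangular(spread, 0.2, 1.0, 2.0)

    # Simple rule base:
    #   IF ambiguity is low  THEN trust the mean reward (safe to exploit).
    #   IF ambiguity is high THEN discount the reward (treat the state as risky).
    mean_reward = candidate_rewards.mean(axis=0)
    discounted = mean_reward - 0.5 * spread  # pessimistic estimate for risky states

    # Membership-weighted combination of the two rule outputs per state.
    weights = low_ambiguity + high_ambiguity + 1e-9
    adjusted_reward = (low_ambiguity * mean_reward + high_ambiguity * discounted) / weights

    print("spread per state:", np.round(spread, 3))
    print("adjusted reward: ", np.round(adjusted_reward, 3))

States whose candidate rewards disagree strongly end up with a lower adjusted reward, which is one plausible way an ambiguity-aware reward could steer learning toward safer action sequences, in line with the risk-and-safety consideration mentioned in the abstract.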