Abstract: Classification-based reinforcement learning (RL) methods have recently been proposed as an alternative to the traditional value-function-based methods. These methods use a classifier to represent a policy, where the input (features) to the classifier is the state and the output (class label) for that state is the desired action. It is well known in the reinforcement-learning community that focusing on more important states can lead to improved performance. In this paper, we investigate the idea of focused learning in the context of classification-based RL. Specifically, we define a useful notion of state importance, which we use to prove rigorous bounds on policy loss. Furthermore, we show that a classification-based RL agent may behave arbitrarily poorly if it treats all states as equally important.
Keywords: attention, function approximation, generalization, reinforcement learning
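
To make the policy representation described in the abstract concrete, the following is a minimal sketch, not taken from the paper: the toy dataset, the importance weights, and the choice of scikit-learn's DecisionTreeClassifier are all illustrative assumptions. It shows a classifier used as a policy (state features in, action label out), plus a "focused" variant that weights training states by importance.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: each row is a state's feature vector,
# each label is the action judged best in that state. In practice this
# data would come from rollouts or an approximate policy-iteration step.
states = np.array([[0.1, 0.9], [0.8, 0.2], [0.5, 0.5], [0.9, 0.7]])
actions = np.array([0, 1, 0, 1])  # class labels = action indices

# "Focused" variant: weight each training state by an (assumed) importance
# score, so the classifier spends its capacity on the important states
# rather than treating all states as equally important.
importance = np.array([1.0, 4.0, 0.5, 4.0])

policy = DecisionTreeClassifier(max_depth=3)
policy.fit(states, actions, sample_weight=importance)

def act(state_features):
    """Policy lookup: classify a state to choose an action."""
    return int(policy.predict(np.asarray(state_features).reshape(1, -1))[0])

print(act([0.2, 0.8]))  # -> 0 under this toy training set
```

Dropping the sample_weight argument recovers the unfocused baseline in which every state contributes equally to the classification loss.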