Abstract: Optimizing human-robot interaction in the performance of collaborative tasks is a challenging problem. An important aspect of this problem relates to the robot's understanding of human intentions. Empowering robots with accurate intention-inference capabilities sets the stage for more natural, safe, and efficient interactions and greater confidence in the Human-Robot Interaction domain. Intentions can be deduced by observing human cues during interaction with the environment, but there is currently no clear-cut method for doing so effectively. Here, we present a novel method for intention inference based on the integration of three visual cues, namely hand movement, eye fixation, and object interaction, coupled with a bidirectional LSTM neural network for the classification of human intention. Experimental studies comparing our approach against alternatives that use only two of the visual cues confirm its utility.
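For readers who want a concrete picture of the cue-fusion-plus-BiLSTM pipeline the abstract describes, the following is a minimal sketch in PyTorch. All feature dimensions, the per-frame concatenation fusion, and the number of intention classes are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class IntentionClassifier(nn.Module):
    """Bidirectional LSTM over a fused sequence of visual-cue features.

    Feature dimensions and fusion-by-concatenation are illustrative
    assumptions; the paper's actual architecture may differ.
    """

    def __init__(self, hand_dim=3, gaze_dim=2, obj_dim=8,
                 hidden_dim=64, num_intentions=4):
        super().__init__()
        fused_dim = hand_dim + gaze_dim + obj_dim  # per-frame concatenation
        self.bilstm = nn.LSTM(fused_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # Final hidden states from both directions feed the classifier head.
        self.head = nn.Linear(2 * hidden_dim, num_intentions)

    def forward(self, hand, gaze, obj):
        # hand: (batch, time, hand_dim); gaze and obj shaped analogously.
        x = torch.cat([hand, gaze, obj], dim=-1)
        _, (h_n, _) = self.bilstm(x)              # h_n: (2, batch, hidden_dim)
        h = torch.cat([h_n[0], h_n[1]], dim=-1)   # forward + backward states
        return self.head(h)                        # intention logits


# Toy usage: a batch of 2 sequences, 30 frames each.
model = IntentionClassifier()
logits = model(torch.randn(2, 30, 3),   # hand-movement features
               torch.randn(2, 30, 2),   # eye-fixation features
               torch.randn(2, 30, 8))   # object-interaction features
print(logits.shape)  # torch.Size([2, 4])
```

Concatenating the three cue streams frame by frame is only one plausible fusion strategy; the same BiLSTM backbone would also accept cue-specific encoders or attention-based fusion upstream.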