Publisher: The Japanese Society for Artificial Intelligence
Abstract: When a robot interacts with users in public spaces, it receives various sounds, such as surrounding noise and users' voices. Furthermore, the robot needs to interact with multiple people at the same time. If the robot incorrectly determines whether it should respond to these sounds, it will erroneously respond to surrounding noise or ignore user utterances directed to it. In this paper, we present a machine learning-based method to estimate response obligation, i.e., whether the robot should respond to an input sound. We address a problem setting that is closer to interactions in public spaces than the settings assumed in previous studies. While previous studies assume that input sounds consist only of utterances directed to one of the interlocutors, we deal not only with those utterances but also with noise and monologues. To handle these various sounds, our method uses the results of input sound classification and user behaviors both during an input sound interval and after the interval. In particular, the user behaviors after the interval are introduced as a key factor for improving the estimation accuracy of response obligation; for example, a user tends to stand still after talking to the robot. We demonstrate that the new features significantly improve the estimation performance. We also investigate performance with various combinations of features and show that the results of input sound classification and the user behaviors after the interval are helpful for the estimation.
Keywords: human-robot interaction; multi-party dialogue; multimodal information