文章基本信息

标题：Robust Understanding in Multimodal Interfaces
本地全文：下载
作者：Srinivas Bangalore ; Michael Johnston
期刊名称：Computational Linguistics
印刷版ISSN：0891-2017
电子版ISSN：1530-9312
出版年度：2009
卷号：35
期号：3
页码：345-397
DOI：10.1162/coli.08-022-R2-06-26
语种：English
出版社：MIT Press
摘要：Multimodal grammars provide an effective mechanism for quickly creating integration and understanding capabilities for interactive systems supporting simultaneous use of multiple input modalities. However, like other approaches based on hand-crafted grammars, multimodal grammars can be brittle with respect to unexpected, erroneous, or disfluent input. In this article, we show how the finite-state approach to multimodal language processing can be extended to support multimodal applications combining speech with complex freehand pen input, and evaluate the approach in the context of a multimodal conversational system (MATCH). We explore a range of different techniques for improving the robustness of multimodal integration and understanding. These include techniques for building effective language models for speech recognition when little or no multimodal training data is available, and techniques for robust multimodal understanding that draw on classification, machine translation, and sequence edit methods. We also explore the use of edit-based methods to overcome mismatches between the gesture stream and the speech stream.