Journal: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2006
Volume: 2006
Publisher: ACL Anthology
Abstract: Multimodal grammars provide an expressive formalism for multimodal integration and understanding. However, hand-crafted multimodal grammars can be brittle with respect to unexpected, erroneous, or disfluent inputs. Spoken language (speech-only) understanding systems have addressed this lack of robustness in hand-crafted grammars by exploiting classification techniques to extract fillers of a frame representation. In this paper, we illustrate the limitations of such classification approaches for multimodal integration and understanding, and present an approach based on edit machines that combine the expressiveness of multimodal grammars with the robustness of the stochastic language models used in speech recognition. We also present an approach in which the edit operations are trained from data using a noisy channel model paradigm. We evaluate and compare the performance of the hand-crafted and learned edit machines in the context of a multimodal conversational system (MATCH).