Journal name: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2011
Volume: 2011
Publisher: ACL Anthology
Abstract: The language model (LM) is a critical component in most statistical machine translation (SMT) systems, serving to establish a probability distribution over the hypothesis space. Most SMT systems use a static LM, independent of the source language input. While previous work has shown that adapting LMs based on the input improves SMT performance, none of the techniques has thus far been shown to be feasible for on-line systems. In this paper, we develop a novel measure of cross-lingual similarity for biasing the LM based on the test input. We also illustrate an efficient on-line implementation that supports integration with on-line SMT systems by transferring much of the computational load off-line. Our approach yields significant reductions in target perplexity compared to the static LM, as well as consistent improvements in SMT performance across language pairs (English-Dari and English-Pashto).