Article Information

Title: Deciphering Foreign Language by Combining Language Models and Context Vectors
Authors: Malte Nuhn; Arne Mauser; Hermann Ney et al.
Journal: Conference on European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2012
Volume: 2012
Publisher: ACL Anthology
Abstract: In this paper we show how to train statistical machine translation systems on real-life tasks using only non-parallel monolingual data from two languages. We present a modification of the method shown in (Ravi and Knight, 2011) that is scalable to vocabulary sizes of several thousand words. On the task shown in (Ravi and Knight, 2011) we obtain better results with only 5% of the computational effort when running our method with an n-gram language model. The efficiency improvement of our method allows us to run experiments with vocabulary sizes of around 5,000 words, such as a non-parallel version of the VERBMOBIL corpus. We also report results using data from the monolingual French and English GIGAWORD corpora.