
Article Information

  • Title: Improving Statistical Machine Translation by Adapting Translation Models to Translationese
  • Authors: Gennadi Lembersky; Noam Ordan; Shuly Wintner
  • Journal: Computational Linguistics
  • Print ISSN: 0891-2017
  • Electronic ISSN: 1530-9312
  • Publication year: 2013
  • Volume: 39
  • Issue: 4
  • Pages: 999-1023
  • DOI: 10.1162/COLI_a_00159
  • Language: English
  • Publisher: MIT Press
  • Abstract: Translation models used for statistical machine translation are compiled from parallel corpora that are manually translated. The common assumption is that parallel texts are symmetrical: The direction of translation is deemed irrelevant and is consequently ignored. Much research in Translation Studies indicates that the direction of translation matters, however, as translated language (translationese) has many unique properties. It has already been shown that phrase tables constructed from parallel corpora translated in the same direction as the translation task outperform those constructed from corpora translated in the opposite direction. We reconfirm that this is indeed the case, but emphasize the importance of also using texts translated in the “wrong” direction. We take advantage of information pertaining to the direction of translation in constructing phrase tables by adapting the translation model to the special properties of translationese. We explore two adaptation techniques: First, we create a mixture model by interpolating phrase tables trained on texts translated in the “right” and the “wrong” directions. The weights for the interpolation are determined by minimizing perplexity. Second, we define entropy-based measures that estimate the correspondence of target-language phrases to translationese, thereby eliminating the need to annotate the parallel corpus with information pertaining to the direction of translation. We show that incorporating these measures as features in the phrase tables of statistical machine translation systems results in consistent, statistically significant improvement in the quality of the translation.
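
The first adaptation technique in the abstract (a mixture of two phrase tables whose weight is chosen by minimizing perplexity) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The toy tables, the held-out phrase pairs, the probability floor, and the function names `interpolate`, `perplexity`, and `best_weight` are all hypothetical. The weight is found here by a plain grid search, which stands in for whatever optimizer the paper actually uses.

```python
import math

def interpolate(p_right, p_wrong, lam):
    """Linear mixture of two phrase-translation probabilities."""
    return lam * p_right + (1.0 - lam) * p_wrong

def perplexity(dev_pairs, table_right, table_wrong, lam, floor=1e-9):
    """Perplexity of the mixture on held-out phrase pairs.

    Pairs missing from a table get probability 0.0 from that table;
    the floor (an assumption of this sketch) avoids log(0).
    """
    log_sum = 0.0
    for pair in dev_pairs:
        p = interpolate(table_right.get(pair, 0.0),
                        table_wrong.get(pair, 0.0), lam)
        log_sum += math.log(max(p, floor))
    return math.exp(-log_sum / len(dev_pairs))

def best_weight(dev_pairs, table_right, table_wrong, steps=100):
    """Grid search over lam in [0, 1] for the minimum-perplexity weight."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates,
               key=lambda lam: perplexity(dev_pairs, table_right,
                                          table_wrong, lam))

# Hypothetical toy phrase tables: (source phrase, target phrase) -> probability.
# table_right: trained on text translated in the same direction as the task.
# table_wrong: trained on text translated in the opposite direction.
table_right = {("maison", "house"): 0.8, ("maison", "home"): 0.2}
table_wrong = {("maison", "house"): 0.5, ("maison", "home"): 0.5}

# Hypothetical held-out phrase pairs used to tune the weight.
dev_pairs = [("maison", "house")] * 3 + [("maison", "home")]

best_lam = best_weight(dev_pairs, table_right, table_wrong)
print("weight on the 'right'-direction table:", best_lam)
```

In this toy setup the held-out data favors neither table exclusively, so the selected weight lies strictly between 0 and 1, which mirrors the abstract's point that texts translated in the “wrong” direction still contribute useful information.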