
Basic Article Information

  • Title: Smaller Alignment Models for Better Translations: Unsupervised Word Alignment with the ℓ0-norm
  • Authors: Ashish Vaswani; Liang Huang; David Chiang
  • Journal name: Conference on European Chapter of the Association for Computational Linguistics (EACL)
  • Publication year: 2012
  • Volume: 2012
  • Publisher: ACL Anthology
  • Abstract: Two decades after their invention, the IBM word-based translation models, widely available in the GIZA++ toolkit, remain the dominant approach to word alignment and an integral part of many statistical translation systems. Although many models have surpassed them in accuracy, none have supplanted them in practice. In this paper, we propose a simple extension to the IBM models: an ℓ0 prior to encourage sparsity in the word-to-word translation model. We explain how to implement this extension efficiently for large-scale data (also released as a modification to GIZA++) and demonstrate, in experiments on Czech, Arabic, Chinese, and Urdu to English translation, significant improvements over IBM Model 4 in both word alignment (up to +6.7 F1) and translation quality (up to +1.4 Bleu).