
Article information

  • Title: RuLearn: an Open-source Toolkit for the Automatic Inference of Shallow-transfer Rules for Machine Translation
  • Authors: Víctor M. Sánchez-Cartagena ; Juan Antonio Pérez-Ortiz ; Felipe Sánchez-Martínez
  • Journal: The Prague Bulletin of Mathematical Linguistics
  • Print ISSN: 0032-6585
  • Online ISSN: 1804-0462
  • Year of publication: 2016
  • Volume: 106
  • Issue: 1
  • Pages: 193-204
  • DOI: 10.1515/pralin-2016-0018
  • Language: English
  • Publisher: Walter de Gruyter GmbH
  • Abstract: This paper presents ruLearn, an open-source toolkit for the automatic inference of rules for shallow-transfer machine translation from scarce parallel corpora and morphological dictionaries. ruLearn will make rule-based machine translation a very appealing alternative for under-resourced language pairs because it avoids the need for human experts to handcraft transfer rules and, in contrast to statistical machine translation, requires only a small parallel corpus (a few hundred parallel sentences proved to be sufficient). The inference algorithm implemented by ruLearn was recently published by the same authors in Computer Speech & Language (volume 32). It produces rules whose translation quality is similar to that obtained with hand-crafted rules. ruLearn generates rules that are ready for use in the Apertium platform, although they can be easily adapted to other platforms. When the rules produced by ruLearn are combined with a hybridisation strategy for integrating linguistic resources from shallow-transfer rule-based machine translation into phrase-based statistical machine translation (published by the same authors in Journal of Artificial Intelligence Research, volume 55), they help to mitigate data sparseness. This paper also shows how to use ruLearn and describes its implementation.