
Article Information

  • Title: A New Hybrid Metric for Verifying Parallel Corpora of Arabic English
  • Authors: Saad Alkahtani; Wei Liu; William J. Teahan
  • Journal: Computer Science & Information Technology
  • Electronic ISSN: 2231-5403
  • Publication year: 2015
  • Volume: 5
  • Issue: 2
  • Pages: 123-139
  • DOI: 10.5121/csit.2015.50211
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: This paper discusses a new metric that has been applied to verify the quality in translation between sentence pairs in parallel corpora of Arabic-English. This metric combines two techniques, one based on sentence length and the other based on compression code length. Experiments on sample test parallel Arabic-English corpora indicate the combination of these two techniques improves accuracy of the identification of satisfactory and unsatisfactory sentence pairs compared to sentence length and compression code length alone. The new method proposed in this research is effective at filtering noise and reducing mis-translations resulting in greatly improved quality.
  • Keywords: Parallel Corpus; Sentence Alignment for Machine Translation; Prediction by Partial Matching; Compression
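
The abstract names two signals, sentence length and compression code length, without giving the formula that combines them. The Python sketch below is only an illustration of the general idea, not the authors' method. It makes several assumptions: zlib stands in for the Prediction by Partial Matching (PPM) compressor named in the keywords, each signal is expressed as a min/max ratio, and the two are mixed by a weighted average. The function names, the `weight` parameter, and the `threshold` value are all hypothetical.

```python
import zlib


def code_length_bits(text: str) -> int:
    """Approximate compression code length in bits.

    Assumption: zlib is a stand-in here; the paper's keywords name PPM.
    """
    return 8 * len(zlib.compress(text.encode("utf-8"), 9))


def length_ratio(src: str, tgt: str) -> float:
    """Ratio of the shorter to the longer sentence, measured in UTF-8 bytes."""
    a, b = len(src.encode("utf-8")), len(tgt.encode("utf-8"))
    longest = max(a, b)
    return min(a, b) / longest if longest else 1.0


def code_length_ratio(src: str, tgt: str) -> float:
    """Ratio of the smaller to the larger compression code length."""
    a, b = code_length_bits(src), code_length_bits(tgt)
    # zlib output is never empty, so the denominator is always positive.
    return min(a, b) / max(a, b)


def hybrid_score(src: str, tgt: str, weight: float = 0.5) -> float:
    """Hypothetical combination: weighted average of the two ratios."""
    return weight * length_ratio(src, tgt) + (1 - weight) * code_length_ratio(src, tgt)


def is_satisfactory(src: str, tgt: str, threshold: float = 0.5, weight: float = 0.5) -> bool:
    """Flag a sentence pair as satisfactory when the score reaches the threshold.

    The threshold is an illustrative value, not one taken from the paper.
    """
    return hybrid_score(src, tgt, weight) >= threshold
```

Scores lie in the range (0, 1], and a pair of identical strings scores 1.0. In practice the threshold and weight would have to be tuned on labelled Arabic-English sentence pairs, and byte lengths would likely need normalising, since Arabic text takes more UTF-8 bytes per character than English.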