Article Information

  • Title: Improving Evaluation of Document-level Machine Translation Quality Estimation
  • Authors: Yvette Graham ; Qingsong Ma ; Timothy Baldwin
  • Journal name: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
  • Publication year: 2017
  • Volume: 2017
  • Pages: 356-361
  • Language: English
  • Publisher: ACL Anthology
  • Abstract: Meaningful conclusions about the relative performance of NLP systems are only possible if the gold standard employed in a given evaluation is both valid and reliable. In this paper, we explore the validity of human annotations currently employed in the evaluation of document-level quality estimation for machine translation (MT). We demonstrate the degree to which MT system rankings are dependent on weights employed in the construction of the gold standard, before proposing direct human assessment as a valid alternative. Experiments show direct assessment (DA) scores for documents to be highly reliable, achieving a correlation of above 0.9 in a self-replication experiment, in addition to a substantial estimated cost reduction through quality-controlled crowd-sourcing. The original gold standard based on post-edits incurs a 10–20 times greater cost than DA.