
Article information

  • Title: Constructing Corpora for the Development and Evaluation of Paraphrase Systems
  • Authors: Trevor Cohn; Chris Callison-Burch; Mirella Lapata
  • Journal: Computational Linguistics
  • Print ISSN: 0891-2017
  • Electronic ISSN: 1530-9312
  • Publication year: 2008
  • Volume: 34
  • Issue: 4
  • Pages: 597-614
  • DOI: 10.1162/coli.08-003-R1-07-044
  • Language: English
  • Publisher: MIT Press
  • Abstract: Automatic paraphrasing is an important component in many natural language processing tasks. In this article we present a new parallel corpus with paraphrase annotations. We adopt a definition of paraphrase based on word alignments and show that it yields high inter-annotator agreement. As Kappa is suited to nominal data, we employ an alternative agreement statistic which is appropriate for structured alignment tasks. We discuss how the corpus can be usefully employed in evaluating paraphrase systems automatically (e.g., by measuring precision, recall, and F1) and also in developing linguistically rich paraphrase models based on syntactic structure.
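
The automatic evaluation mentioned in the abstract (precision, recall, and F1 against annotated alignments) can be illustrated with a minimal sketch. It assumes that both the gold annotation and the system output are represented as sets of (source index, target index) word-alignment pairs. That representation, the function name `alignment_prf`, and the toy data are illustrative assumptions, not the article's own evaluation code or its agreement statistic.

```python
from typing import Iterable, Tuple

AlignmentPair = Tuple[int, int]  # (source word index, target word index); assumed representation


def alignment_prf(predicted: Iterable[AlignmentPair],
                  reference: Iterable[AlignmentPair]) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 over word-alignment pairs."""
    predicted_set, reference_set = set(predicted), set(reference)
    true_positives = len(predicted_set & reference_set)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(reference_set) if reference_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical toy data: four gold alignment pairs, three predicted pairs.
gold = {(0, 0), (1, 1), (2, 3), (3, 2)}
system = {(0, 0), (1, 1), (2, 2)}

precision, recall, f1 = alignment_prf(system, gold)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy case two of the three predicted pairs appear in the gold set, so precision is 2/3, recall is 2/4, and F1 is 4/7. A corpus-level score would typically pool the counts over all sentence pairs rather than average per-sentence scores, though that choice is also an assumption here.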