Article Information

  • Title: A Critique and Improvement of an Evaluation Metric for Text Segmentation
  • Authors: Lev Pevzner; Marti A. Hearst
  • Journal: Computational Linguistics
  • Print ISSN: 0891-2017
  • Electronic ISSN: 1530-9312
  • Publication year: 2002
  • Volume: 28
  • Issue: 1
  • Pages: 19-36
  • DOI: 10.1162/089120102317341756
  • Language: English
  • Publisher: MIT Press
  • Abstract: The Pk evaluation metric, initially proposed by Beeferman, Berger, and Lafferty (1997), is becoming the standard measure for assessing text segmentation algorithms. However, a theoretical analysis of the metric finds several problems: the metric penalizes false negatives more heavily than false positives, overpenalizes near misses, and is affected by variation in segment size distribution. We propose a simple modification to the Pk metric that remedies these problems. This new metric, called WindowDiff, moves a fixed-sized window across the text and penalizes the algorithm whenever the number of boundaries within the window does not match the true number of boundaries for that window of text.
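
The window-based comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes each segmentation is given as a list of 0/1 flags, one per gap between adjacent text units, where 1 marks a boundary; the function name `window_diff` and this input encoding are choices made here for illustration. The window size `k` is left to the caller (a common convention is half the average reference segment length, which is also an assumption of this sketch).

```python
from typing import Sequence


def window_diff(reference: Sequence[int], hypothesis: Sequence[int], k: int) -> float:
    """Fraction of length-k windows whose boundary counts disagree.

    reference, hypothesis: 0/1 flags, one per gap between adjacent text
    units; 1 means a segment boundary falls in that gap (assumed encoding).
    k: window size, measured in gaps.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must cover the same number of gaps")
    n = len(reference)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of gaps")

    windows = n - k + 1  # number of window positions as it slides over the text
    errors = 0
    for start in range(windows):
        ref_count = sum(reference[start:start + k])
        hyp_count = sum(hypothesis[start:start + k])
        # Penalize the window whenever the two boundary counts differ.
        if ref_count != hyp_count:
            errors += 1
    return errors / windows


# Hypothetical example data: the first boundary is placed one gap too late.
reference = [0, 0, 1, 0, 0, 0, 1, 0, 0]
near_miss = [0, 0, 0, 1, 0, 0, 1, 0, 0]
score = window_diff(reference, near_miss, k=3)
```

In this example only two of the seven window positions disagree, so the near miss scores 2/7, while a hypothesis with no boundaries at all scores 6/7 against the same reference.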