Article Information

  • Title: Reliability Measurement without Limits
  • Authors: Dennis Reidsma; Jean Carletta
  • Journal: Computational Linguistics
  • Print ISSN: 0891-2017
  • Electronic ISSN: 1530-9312
  • Year of publication: 2008
  • Volume: 34
  • Issue: 3
  • Pages: 319-326
  • DOI: 10.1162/coli.2008.34.3.319
  • Language: English
  • Publisher: MIT Press
  • Abstract: In computational linguistics, a reliability measurement of 0.8 on some statistic such as κ is widely thought to guarantee that hand-coded data is fit for purpose, with 0.67 to 0.8 tolerable, and lower values suspect. We demonstrate that the main use of such data, machine learning, can tolerate data with low reliability as long as any disagreement among human coders looks like random noise. When the disagreement introduces patterns, however, the machine learner can pick these up just like it picks up the real patterns in the data, making the performance figures look better than they really are. For the range of reliability measures that the field currently accepts, disagreement can appreciably inflate performance figures, and even a measure of 0.8 does not guarantee that what looks like good performance really is. Although this is a commonsense result, it has implications for how we work. At the very least, computational linguists should look for any patterns in the disagreement among coders and assess what impact they will have.
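
The abstract's closing recommendation, looking for patterns in coder disagreement and not relying on a single κ value, can be illustrated with a small sketch. This is not the authors' procedure: it assumes two coders, uses Cohen's κ as one common choice of agreement statistic, and the function names and toy labels (`q`, `s`) are hypothetical.

```python
from collections import Counter

def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders labelling the same items (illustrative)."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(coder_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(x == y for x, y in zip(coder_a, coder_b)) / n
    # Chance agreement from each coder's own label distribution.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[l] * freq_b[l] for l in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1.0 - expected)

def disagreement_counts(coder_a, coder_b):
    """Count each (label from A, label from B) pair where the coders differ."""
    return Counter((x, y) for x, y in zip(coder_a, coder_b) if x != y)

# Hypothetical toy data, invented for illustration only.
coder_a = ["q", "q", "s", "s", "s", "q", "s", "q", "s", "s"]
coder_b = ["q", "s", "s", "s", "s", "s", "s", "q", "s", "s"]

kappa = cohen_kappa(coder_a, coder_b)
pairs = disagreement_counts(coder_a, coder_b)
print(f"kappa = {kappa:.3f}")
for (label_a, label_b), count in pairs.most_common():
    print(f"A={label_a!r} vs B={label_b!r}: {count}")
```

In this toy data every disagreement runs in the same direction (coder A says `q` where coder B says `s`). That is the kind of systematic, non-random disagreement the abstract warns a learner could pick up, and the κ value alone would not reveal it.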