
Article information

  • Title: Binary classification with corrupted labels
  • Authors: Yonghoon Lee; Rina Foygel Barber
  • Journal: Electronic Journal of Statistics
  • Print ISSN: 1935-7524
  • Year of publication: 2022
  • Volume: 16
  • Issue: 1
  • Pages: 1367-1392
  • DOI: 10.1214/22-EJS1987
  • Language: English
  • Publisher: Institute of Mathematical Statistics
  • Abstract: In a binary classification problem where the goal is to fit an accurate predictor, the presence of corrupted labels in the training data set may create an additional challenge. However, in settings where likelihood maximization is poorly behaved—for example, if positive and negative labels are perfectly separable—then a small fraction of corrupted labels can improve performance by ensuring robustness. In this work, we establish that in such settings, corruption acts as a form of regularization, and we compute precise upper bounds on estimation error in the presence of corruptions. Our results suggest that the presence of corrupted data points is beneficial only up to a small fraction of the total sample, scaling with the square root of the sample size.
  • Keywords: 62H30; classification; label noise