Article information

  • Title: Impossibility of successful classification when useful features are rare and weak
  • Author: Jiashun Jin
  • Journal: Proceedings of the National Academy of Sciences
  • Print ISSN: 0027-8424
  • Electronic ISSN: 1091-6490
  • Publication year: 2009
  • Volume: 106
  • Issue: 22
  • Pages: 8859-8864
  • DOI: 10.1073/pnas.0903931106
  • Language: English
  • Publisher: The National Academy of Sciences of the United States of America
  • Abstract: We study a two-class classification problem with a large number of features, out of which many are useless and only a few are useful, but we do not know which ones they are. The number of features is large compared with the number of training observations. Calibrating the model with 4 key parameters--the number of features, the size of the training sample, the fraction, and strength of useful features--we identify a region in parameter space where no trained classifier can reliably separate the two classes on fresh data. The complement of this region--where successful classification is possible--is also briefly discussed.
  • Keywords: higher criticism; phase diagram; region of impossibility; region of possibility; threshold feature selection