
Article Information

  • Title: Prevalence of neural collapse during the terminal phase of deep learning training
  • Authors: Vardan Papyan; X. Y. Han; David L. Donoho
  • Journal: Proceedings of the National Academy of Sciences
  • Print ISSN: 0027-8424
  • Electronic ISSN: 1091-6490
  • Year: 2020
  • Volume: 117
  • Issue: 40
  • Pages: 24652-24663
  • DOI: 10.1073/pnas.2015509117
  • Publisher: The National Academy of Sciences of the United States of America
  • Abstract: Modern practice for training classification deepnets involves a terminal phase of training (TPT), which begins at the epoch where training error first vanishes. During TPT, the training error stays effectively zero, while training loss is pushed toward zero. Direct measurements of TPT, for three prototypical deepnet architectures and across seven canonical classification datasets, expose a pervasive inductive bias we call neural collapse (NC), involving four deeply interconnected phenomena. (NC1) Cross-example within-class variability of last-layer training activations collapses to zero, as the individual activations themselves collapse to their class means. (NC2) The class means collapse to the vertices of a simplex equiangular tight frame (ETF). (NC3) Up to rescaling, the last-layer classifiers collapse to the class means or in other words, to the simplex ETF (i.e., to a self-dual configuration). (NC4) For a given activation, the classifier’s decision collapses to simply choosing whichever class has the closest train class mean (i.e., the nearest class center [NCC] decision rule). The symmetric and very simple geometry induced by the TPT confers important benefits, including better generalization performance, better robustness, and better interpretability.
  • Keywords: deep learning; inductive bias; adversarial robustness; simplex equiangular tight frame; nearest class center
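
The NC1, NC2, and NC4 phenomena described in the abstract are straightforward to probe numerically. Below is a minimal NumPy sketch (helper names are illustrative, not the authors' measurement code; it assumes last-layer activations and their labels have already been extracted as arrays) that computes per-class means, an NC1-style within-class variability score, the NC2 simplex-ETF cosine check on centered class means, and the NC4 nearest class center (NCC) decision rule.

    import numpy as np

    def class_means(features, labels, num_classes):
        # Per-class means of last-layer activations; features has shape [N, D].
        return np.stack([features[labels == c].mean(axis=0) for c in range(num_classes)])

    def within_class_variability(features, labels, means):
        # NC1-style score: mean squared distance of each activation to its class mean.
        return np.mean(np.sum((features - means[labels]) ** 2, axis=1))

    def etf_cosines(means):
        # NC2 check: for a simplex ETF, off-diagonal cosines of the centered,
        # normalized class means approach -1/(C-1).
        centered = means - means.mean(axis=0)
        normed = centered / np.linalg.norm(centered, axis=1, keepdims=True)
        return normed @ normed.T

    def ncc_predict(features, means):
        # NC4: nearest class center decision rule (squared Euclidean distance).
        d2 = ((features[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
        return d2.argmin(axis=1)

    # Toy usage with synthetic "activations" standing in for real last-layer features.
    rng = np.random.default_rng(0)
    C, N, D = 10, 1000, 64
    labels = rng.integers(0, C, size=N)
    features = rng.normal(size=(N, D)) + 5.0 * rng.normal(size=(C, D))[labels]

    mu = class_means(features, labels, C)
    print("NC1 within-class variability:", within_class_variability(features, labels, mu))
    print("NC2 off-diagonal cosines (target -1/(C-1)):", etf_cosines(mu)[0, 1:4])
    print("NC4 NCC agreement with labels:", (ncc_predict(features, mu) == labels).mean())

On a trained deepnet deep in TPT, the NC1 score would shrink toward zero and the off-diagonal cosines would approach -1/(C-1); the random toy data above will not exhibit either and is included only to show the calling convention.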