
Article Information

  • Title: Extreme Multiclass Classification Criteria
  • Authors: Anna Choromanska; Ish Kumar Jain
  • Journal: Computation
  • Electronic ISSN: 2079-3197
  • Year: 2019
  • Volume: 7
  • Issue: 1
  • Pages: 16-34
  • DOI: 10.3390/computation7010016
  • Publisher: MDPI Publishing
  • Abstract: We analyze the theoretical properties of the recently proposed objective function for efficient online construction and training of multiclass classification trees in the settings where the label space is very large. We show the important properties of this objective and provide a complete proof that maximizing it simultaneously encourages balanced trees and improves the purity of the class distributions at subsequent levels in the tree. We further explore its connection to the three well-known entropy-based decision tree criteria, i.e., Shannon entropy, Gini-entropy and its modified variant, for which efficient optimization strategies are largely unknown in the extreme multiclass setting. We show theoretically that this objective can be viewed as a surrogate function for all of these entropy criteria and that maximizing it indirectly optimizes them as well. We derive boosting guarantees and obtain a closed-form expression for the number of iterations needed to reduce the considered entropy criteria below an arbitrary threshold. The obtained theorem relies on a weak hypothesis assumption that directly depends on the considered objective function. Finally, we prove that optimizing the objective directly reduces the multi-class classification error of the decision tree.
  • Keywords: multiclass classification; decision trees; boosting