Article Information

  • Title: Variable Importance Score
  • Authors: Wei-Yin Loh; Peigen Zhou
  • Journal: Journal of Data Science
  • Print ISSN: 1680-743X
  • Online ISSN: 1683-8602
  • Year: 2021
  • Volume: 19
  • Issue: 8
  • Pages: 569-592
  • DOI: 10.6339/21-JDS1023
  • Language: English
  • Publisher: Tingmao Publish Company
  • Abstract: There are many methods of scoring the importance of variables in prediction of a response but not much is known about their accuracy. This paper partially fills the gap by introducing a new method based on the GUIDE algorithm and comparing it with 11 existing methods. For data without missing values, eight methods are shown to give biased scores that are too high or too low, depending on the type of variables (ordinal, binary or nominal) and whether or not they are dependent on other variables, even when all of them are independent of the response. Among the remaining four methods, only GUIDE continues to give unbiased scores if there are missing data values. It does this with a self-calibrating bias-correction step that is applicable to data with and without missing values. GUIDE also provides threshold scores for differentiating important from unimportant variables with 95 and 99 percent confidence. Correlations of the scores to the predictive power of the methods are studied in three real data sets. For many methods, correlations with marginal predictive power are much higher than with conditional predictive power.