Article Information

  • Title: Variable Importance Plots—An Introduction to the vip Package
  • Authors: Brandon M. Greenwell; Bradley C. Boehmke
  • Journal: R News
  • Print ISSN: 1609-3631
  • Year of publication: 2020
  • Volume: 12
  • Issue: 1
  • Pages: 343-366
  • Language: English
  • Publisher: The R Foundation for Statistical Computing
  • Abstract: In the era of “big data”, it is becoming more of a challenge not only to build state-of-the-art predictive models, but also to gain an understanding of what’s really going on in the data. For example, it is often of interest to know which, if any, of the predictors in a fitted model are relatively influential on the predicted outcome. Some modern algorithms, like random forests (RFs) and gradient boosted decision trees (GBMs), have a natural way of quantifying the importance or relative influence of each feature. Other algorithms, like naive Bayes classifiers and support vector machines, are not capable of doing so, and model-agnostic approaches are generally used to measure each predictor’s importance. Enter vip, an R package for constructing variable importance scores/plots for many types of supervised learning algorithms using model-specific and novel model-agnostic approaches. We’ll also discuss a novel way to display both feature importance and feature effects together using sparklines, very small line charts conveying the general shape or variation in some feature that can be directly embedded in text or tables.