Journal: Case Studies in Business, Industry and Government Statistics
Print ISSN: 2152-372X
Publication year: 2012
Volume: 5
Issue: 1
Pages: 12-16
Publisher: Bentley University
Abstract: This paper provides a relatively new technique for predicting the retention of students in an actuarial mathematics program. The authors utilize data from a previous research study. In that study, logistic regression, classification trees, and neural networks were compared. The neural networks (with prior imputation of missing data) and classification trees (with no imputation required) were most accurate. However, in this paper, we examine the use of gradient boosting to improve the accuracy of classification trees. We focus on trees since they generate transparent rules that are easily interpretable, especially by non-statisticians. Gradient boosting is an enhancement that is applied specifically to decision trees, and we show that it does, at least in this study, improve the classification accuracy of our default tree. The exposition is accessible to readers with an intermediate level of statistics.
Keywords: Logistic Regression; Data Mining; Neural Nets; Decision Trees; Gradient Boosting