Journal: Practical Assessment, Research and Evaluation
Print ISSN: 1531-7714
Electronic ISSN: 1531-7714
Year of publication: 2018
Volume: 23
Publisher: ERIC: Clearinghouse On Assessment and Evaluation
Abstract: In institutional research, modern data mining approaches are seldom considered to address predictive analytics problems. The goal of this paper is to highlight the advantages of tree-based machine learning algorithms over classic (logistic) regression methods for data-informed decision making in higher education problems, and stress the success of random forest in circumstances where the regression assumptions are often violated in big data applications. Random forest is a model averaging procedure where each tree is constructed based on a bootstrap sample of the data set. In particular, we emphasize the ease of application, low computational cost, high predictive accuracy, flexibility, and interpretability of random forest machinery. Our overall recommendation is that institutional researchers look beyond classical regression and single decision tree analytics tools, and consider random forest as the predominant method for prediction tasks. The proposed points of view are detailed and illustrated through a simulation experiment and analyses of data from real institutional research projects.
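The abstract's description of random forest as model averaging over trees grown on bootstrap samples can be illustrated with a minimal sketch. This is not the authors' implementation. To stay short, it assumes one-split trees (decision stumps) in place of full trees, a random choice of one feature per tree, and a majority vote as the averaging step. All function names and the toy student records (GPA, enrolled credits, retained or not) are hypothetical and invented for illustration.

```python
import random

def fit_stump(rows, labels, feature_ids):
    """Return the one-feature threshold split with the fewest training errors."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            # Majority class on each side; ties go to class 0,
            # and an empty right side reuses the left label.
            l_lab = int(sum(left) * 2 > len(left))
            r_lab = int(sum(right) * 2 > len(right)) if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]

def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab

def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Grow each stump on its own bootstrap sample and random feature subset."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample n rows with replacement
        feats = rng.sample(range(len(rows[0])), n_features)
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        )
    return forest

def predict_forest(forest, row):
    """Aggregate the individual predictions by majority vote."""
    votes = sum(predict_stump(s, row) for s in forest)
    return int(votes * 2 > len(forest))

# Invented toy data: (GPA, enrolled credits) -> retained (1) or not (0).
rows = [(1.8, 6), (2.0, 8), (2.2, 9), (2.4, 10),
        (3.0, 14), (3.2, 15), (3.5, 16), (3.8, 18)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(rows, labels)
print(predict_forest(forest, (1.5, 5)))
print(predict_forest(forest, (3.6, 17)))
```

In practice an established library implementation with full-depth trees would be used; the sketch only makes the bootstrap-then-aggregate structure concrete.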