
Article Information

  • Title: Visualizing Random Forest’s Prediction Results
  • Authors: Hudson F. Golino, Cristiano Mauro Assis Gomes
  • Journal: Psychology
  • Print ISSN: 2152-7180
  • Electronic ISSN: 2152-7199
  • Publication year: 2014
  • Volume: 05
  • Issue: 19
  • Pages: 2084-2098
  • DOI: 10.4236/psych.2014.519211
  • Language: English
  • Publisher: Scientific Research Publishing
  • Abstract: The current paper proposes a new visualization tool to help check the quality of random forest predictions by plotting the proximity matrix as a weighted network. This new visualization technique is compared with the traditional multidimensional scaling plot. The paper also introduces a new accuracy index (the proportion of misplaced cases) and compares it with total accuracy, sensitivity and specificity. It also applies clustering coefficients to weighted graphs in order to understand how well the random forest algorithm separates two classes. Two datasets were analyzed, one from medical research (breast cancer) and the other from psychology research (medical students’ academic achievement), varying the sample sizes and the predictive accuracy. With different numbers of observations and different possible prediction accuracies, it was possible to compare how each visualization technique behaves in each situation. The results indicated that the random forest’s predictive performance was easier and more intuitive to interpret using the weighted network of the proximity matrix than using the multidimensional scaling plot. The proportion of misplaced cases was highly related to total accuracy, sensitivity and specificity. This strategy, together with the computation of Zhang and Horvath’s (2005) clustering coefficient for weighted graphs, can be very helpful in understanding how well a random forest prediction performs in terms of classification.
  • Keywords: Machine Learning; Assessment; Prediction; Visualization; Networks; Cluster
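
The abstract refers to Zhang and Horvath’s (2005) clustering coefficient for weighted graphs, applied to a random forest proximity matrix treated as a weighted network. The following is a minimal sketch of that coefficient, not the authors' code. It assumes a symmetric proximity matrix with entries in [0, 1]; the function name `zhang_horvath_clustering` and the toy matrix are illustrative and do not come from the paper. For node i, the coefficient is the sum over j and k of a_ij * a_jk * a_ki, divided by (sum_j a_ij)^2 - sum_j a_ij^2, with the diagonal excluded.

```python
import numpy as np


def zhang_horvath_clustering(proximity):
    """Per-node clustering coefficient for a weighted graph (Zhang & Horvath, 2005).

    Assumes `proximity` is a symmetric square matrix with weights in [0, 1].
    """
    a = np.asarray(proximity, dtype=float).copy()
    # Self-proximities are not edges, so remove them.
    np.fill_diagonal(a, 0.0)

    # Numerator: sum over j, k of a_ij * a_jk * a_ki (weighted triangles at i).
    numerator = np.einsum("ij,jk,ki->i", a, a, a)

    # Denominator: (sum_j a_ij)^2 - sum_j a_ij^2.
    row_sum = a.sum(axis=1)
    denominator = row_sum ** 2 - (a ** 2).sum(axis=1)

    # Nodes with fewer than two weighted neighbours get a coefficient of 0.
    coefficient = np.zeros_like(row_sum)
    np.divide(numerator, denominator, out=coefficient, where=denominator > 0)
    return coefficient


# Hypothetical 4-case proximity matrix: three similar cases and one outlier.
toy_proximity = np.array([
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.7, 0.2],
    [0.8, 0.7, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
])
print(zhang_horvath_clustering(toy_proximity))
```

In practice the proximity matrix would come from a fitted random forest (for example, the share of trees in which two cases fall in the same terminal node); that step is left out here to keep the sketch self-contained.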