
Article Information

  • Title: Random Forest with Sampling Techniques for Handling Imbalanced Prediction of University Student Depression
  • Authors: Siriporn Sawangarreerak; Putthiporn Thanathamathee
  • Journal: Information
  • Electronic ISSN: 2078-2489
  • Publication year: 2020
  • Volume: 11
  • Issue: 11
  • Pages: 519-531
  • DOI: 10.3390/info11110519
  • Publisher: MDPI Publishing
  • Abstract: In this work, we propose a combined sampling technique to improve the performance of imbalanced classification of university student depression data. In experimental results, we found that combining random oversampling with Tomek links undersampling generated a relatively balanced depression dataset without losing significant information. In this case, the random oversampling technique was used for sampling the minority class to balance the number of samples between the datasets. Then, the Tomek links technique was used for undersampling the samples by removing the depression data considered less relevant and noisy. The relatively balanced dataset was classified by random forest. The results show that the overall accuracy in the prediction of adolescent depression data was 94.17%, outperforming the individual sampling techniques. Moreover, our proposed method was tested with another dataset for its external validity. This dataset’s predictive accuracy was found to be 93.33%.
  • Keywords: depression prediction; imbalanced data; sampling techniques; feature selection; Patient Health Questionnaire-9 (PHQ-9)
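
The two resampling steps named in the abstract (random oversampling of the minority class, then Tomek links undersampling) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy one-dimensional data, the binary-label setting, and the choice to drop the majority-class member of each Tomek link are all assumptions made for illustration.

```python
import random


def random_oversample(X, y, minority, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size.

    Hypothetical helper; assumes binary labels.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    n_majority = len(y) - len(minority_idx)
    extra = [rng.choice(minority_idx) for _ in range(n_majority - len(minority_idx))]
    keep = list(range(len(y))) + extra
    return [X[i] for i in keep], [y[i] for i in keep]


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def nearest_neighbor(X, i):
    """Index of the row closest to row i (ties go to the lowest index)."""
    others = (j for j in range(len(X)) if j != i)
    return min(others, key=lambda j: squared_distance(X[i], X[j]))


def remove_tomek_links(X, y, majority):
    """Drop the majority-class member of every Tomek link.

    A Tomek link is a pair of rows from different classes that are each
    other's nearest neighbor. Removing only the majority member is an
    assumption of this sketch.
    """
    nn = [nearest_neighbor(X, i) for i in range(len(X))]
    drop = set()
    for i, j in enumerate(nn):
        if nn[j] == i and y[i] != y[j]:
            drop.add(i if y[i] == majority else j)
    keep = [i for i in range(len(X)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data (invented, not from the paper): class 0 is the majority, class 1 the minority.
X = [[0.0], [1.0], [2.0], [3.0], [4.9], [5.0], [9.0]]
y = [0, 0, 0, 0, 0, 1, 1]

# Step 1: oversample the minority class. Step 2: clean class-boundary pairs.
X_over, y_over = random_oversample(X, y, minority=1, seed=0)
X_bal, y_bal = remove_tomek_links(X_over, y_over, majority=0)

print("class counts after oversampling:", y_over.count(0), y_over.count(1))
print("class counts after Tomek links: ", y_bal.count(0), y_bal.count(1))
```

The resampled rows could then be passed to a random forest classifier, for example scikit-learn's `RandomForestClassifier`; that step is left out here so the sketch runs with the standard library alone.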