
Article Information

  • Title: Consensus of Feature Selection Methods and Reduced Generalization Gap Model to Improve Diagnosis of Heart Disease
  • Authors: S. Gupta; R. R. Sedamkar
  • Journal: Journal of Scientific Research
  • Print ISSN: 2070-0237
  • Electronic ISSN: 2070-0245
  • Publication year: 2021
  • Volume: 13
  • Issue: 3
  • Pages: 901-913
  • DOI: 10.3329/jsr.v13i3.53290
  • Language: English
  • Publisher: Rajshahi University
  • Abstract: Enhancing the diagnostic ability of Machine Learning (ML) models so that their predictions are acceptable to the healthcare community is still a concern. Critical care disease datasets are available online, and researchers have experimented on them with differing numbers of instances and features for similar disease prediction. Further, different ML models have different preprocessing requirements. The Framingham heart disease data is multicollinear and has missing values. Thus, the proposed model aims to explore the differential preprocessing needs of ML models, followed by feature selection in consensus with domain experts and feature extraction to resolve multicollinearity issues. Missing values have been imputed differently for each feature. The work also identifies the optimal train set size by plotting a learning curve and choosing the size that provides a minimum generalization gap. When the hyperparameter-tuned model is tested, performance improves with respect to the support-weighted F score, with stratification used because the data is imbalanced. Experimental results demonstrate improvements of up to 3% in weighted F score, precision, recall, and accuracy, and of 8% in F1 score, for the Logistic Regression classifier with the proposed model. Further, the time required for hyperparameter tuning is reduced by 50% for tree-based models, particularly Classification and Regression Tree (CART).
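
The idea of picking a train set size from a learning curve can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn, uses synthetic imbalanced data from make_classification in place of the Framingham data, and the helper name min_gap_train_size, the candidate fractions, and the 5-fold stratified split are all hypothetical choices. The generalization gap is taken here as the absolute difference between mean training and mean validation weighted F1.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, learning_curve


def min_gap_train_size(X, y, fractions=(0.2, 0.4, 0.6, 0.8, 1.0), seed=0):
    """Return the train size whose learning-curve gap is smallest.

    Hypothetical helper: the gap is |mean train score - mean validation score|,
    scored with support-weighted F1 on stratified folds.
    """
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    sizes, train_scores, val_scores = learning_curve(
        LogisticRegression(max_iter=1000),
        X,
        y,
        train_sizes=list(fractions),  # fractions of each fold's training part
        cv=cv,
        scoring="f1_weighted",        # F score weighted by class support
    )
    gaps = np.abs(train_scores.mean(axis=1) - val_scores.mean(axis=1))
    best = int(np.argmin(gaps))
    return int(sizes[best]), float(gaps[best]), sizes, gaps


# Synthetic stand-in for an imbalanced clinical dataset (assumption, not real data).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.85, 0.15], random_state=0
)
best_size, best_gap, sizes, gaps = min_gap_train_size(X, y)
print("candidate sizes:", sizes)
print("gaps:", np.round(gaps, 4))
print("size with minimum gap:", best_size)
```

In practice the smallest gap alone can favor a size where both scores are poor, so the validation score at the chosen size should be inspected alongside the gap.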