
Article information

  • Title: Effects of Using Filter Based Feature Selection on the Performance of Machine Learners Using Different Datasets
  • Authors: Mehnaz Khan; S. M. K. Quadri
  • Journal: BVICAM's International Journal of Information Technology
  • Print ISSN: 0973-5658
  • Year: 2013
  • Volume: 5
  • Issue: 2
  • Language: English
  • Publisher: Bharati Vidyapeeth's Institute of Computer Applications and Management
  • Abstract: Data preprocessing is an important task in machine learning applications. It includes data cleaning, normalization, integration, transformation, reduction, and feature extraction and selection. Feature selection is the technique of selecting a smaller subset of features from the original set of features/attributes in order to exclude irrelevant and redundant features/attributes from the dataset and thereby increase the accuracy of machine learning algorithms. However, a problem arises when further removal of features decreases the accuracy. Therefore, we need to find an optimal subset of the original features that is neither too large nor too small. This paper reviews different feature selection methods (filter, wrapper and embedded) that help in selecting optimal feature subsets. Further, the paper shows the effects of feature selection on different machine learning algorithms (NaiveBayes, RandomForest and kNN). The results show different effects on the accuracy rates when features are selected at different margins.
  • Keywords: Data preprocessing; feature extraction; feature selection; dataset.
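
The filter approach named in the abstract can be illustrated with a minimal sketch. The abstract does not say which scoring measure the authors used, so the absolute Pearson correlation between each feature and the class label is an assumption made here purely for illustration; the function names `pearson` and `filter_select` and the toy data are likewise hypothetical. The idea shown is only the general one: score each feature independently of any classifier, rank the features, and keep the top k.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence is constant, since the
    correlation is undefined in that case.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def filter_select(rows, labels, k):
    """Return the indices of the k features with the highest score.

    The score is |correlation(feature, label)|, an illustrative
    choice and not necessarily the measure used in the paper.
    """
    n_features = len(rows[0])
    scored = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scored.append((abs(pearson(column, labels)), j))
    # Highest score first; ties are broken by the lower feature index.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [j for _, j in scored[:k]]


# Toy dataset (hypothetical): feature 0 tracks the label,
# feature 1 is constant, feature 2 is only weakly related.
rows = [
    [1.0, 5.0, 0.3],
    [2.0, 5.0, 0.1],
    [3.0, 5.0, 0.4],
    [4.0, 5.0, 0.2],
]
labels = [0, 0, 1, 1]

selected = filter_select(rows, labels, k=2)
print(selected)
```

Varying `k` corresponds to selecting features at different margins; the reduced dataset would then be passed to each learner to compare accuracy rates.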