Article information

  • Title: COMPARING TWO FEATURE SELECTIONS METHODS (INFORMATION GAIN AND GAIN RATIO) ON THREE DIFFERENT CLASSIFICATION ALGORITHMS USING ARABIC DATASET
  • Author: ADEL HAMDAN MOHAMMAD
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Publication year: 2018
  • Volume: 96
  • Issue: 6
  • Publisher: Journal of Theoretical and Applied
  • Abstract: Text classification is an important topic. A huge amount of data is now available, and this data needs to be classified and categorized. Research applied to Arabic datasets still needs more investigation. Several methods and techniques are available for classifying data, and several feature selection methods are used in the pre-processing stage. The experiments in this paper apply two feature selection methods (information gain and gain ratio) to three classification methods (Naïve Bayes, decision tree C4.5, and Support Vector Machine (SVM)) using an Arabic dataset. The results show that feature selection in the pre-processing stage is a key success factor. The results also demonstrate that gain ratio is slightly better than information gain, that SVM performs very close to Naïve Bayes, and that both SVM and Naïve Bayes are more accurate than C4.5.
  • Keywords: Text classification; information retrieval; Naïve Bayes; Support Vector Machine; Decision tree; C4.5.