Journal name: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2018
Volume: 96
Issue: 6
Publisher: Journal of Theoretical and Applied
Abstract: Text classification is an important topic. A huge amount of data is now available, and it needs to be classified and categorized. Research on Arabic datasets still needs more investigation. Several methods and techniques are available for classifying data, and several feature selection methods are used in the pre-processing stage. The experiments in this paper apply two feature selection methods (information gain and gain ratio) with three classification methods: Naïve Bayes, the C4.5 decision tree, and Support Vector Machine (SVM). The experiments were conducted on an Arabic dataset. The results show that feature selection in the pre-processing stage is a key success factor. They also show that gain ratio performs slightly better than information gain, that SVM performs very close to Naïve Bayes, and that both SVM and Naïve Bayes are more accurate than C4.5.
Keywords: Text classification; information retrieval; Naïve Bayes; Support Vector Machine; Decision tree; C4.5.
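
Note: The abstract names information gain and gain ratio without defining them. The sketch below assumes the standard entropy-based definitions (gain ratio is information gain divided by the split information of the feature, as in C4.5). It is an illustration only, not the paper's implementation; the function names and the four-document toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on the feature's values."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain normalized by the entropy of the feature itself."""
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return information_gain(feature, labels) / split_info


# Hypothetical toy data: four documents, two classes, and two binary
# term-presence features (1 = term occurs in the document).
labels = ["sport", "sport", "politics", "politics"]
term_a = [1, 1, 0, 0]  # perfectly separates the two classes
term_b = [1, 0, 1, 0]  # independent of the class

scores = {
    "term_a": (information_gain(term_a, labels), gain_ratio(term_a, labels)),
    "term_b": (information_gain(term_b, labels), gain_ratio(term_b, labels)),
}
```

In a feature selection step of this kind, terms would be ranked by one of the two scores and only the top-ranked terms passed to the classifier; here `term_a` would be kept ahead of `term_b` under either score.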