Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Year: 2018
Volume: 96
Issue: 20
Publisher: Journal of Theoretical and Applied
Abstract: Nowadays, accessing the Internet has become an important part of people's lives. Accurately predicting the popularity of a news article before its publication would therefore be valuable. Online classification is well suited to learning from large, high-dimensional datasets. The main objective of this work is to predict and evaluate the popularity of online news. Several feature selection approaches are adopted to reduce the dataset and improve classification and prediction accuracy. Filter approaches such as correlation, information gain, and Relief are used to remove unimportant features so that new instances are classified more accurately. These approaches are used to select the most significant features in the dataset, and their performance is compared. Bayes Network and K-Nearest Neighbors algorithms are then trained for classification and prediction; the training set is used to construct the models, and the testing set is used for validation. The work is tested on a dataset taken from the UCI Machine Learning Repository containing thousands of articles with sixty-two attributes. A feature selection method based on feature extraction and/or feature fusion is proposed, and a comparative study is carried out between the adopted methods and the proposed one. Performance is assessed using measurable criteria such as precision, recall, accuracy, and error in order to highlight the advantages and disadvantages of each approach. In the experiments, the proposed method is promising and outperforms the adopted ones.
Keywords: Feature Selection; Classification Methods; Popularity Prediction; High Dimensional Datasets; Online News
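
The pipeline outlined in the abstract (filter-based feature selection, then classification, then evaluation) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Pearson-correlation ranking as the "correlation" filter, a plain K-Nearest Neighbors classifier with Euclidean distance, and a tiny made-up dataset in place of the UCI data. The function names (`pearson`, `select_top_k`, `knn_predict`, `evaluate`) are hypothetical, and the Bayes Network model and the proposed selection method are not covered.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def select_top_k(X, y, k):
    """Filter selection: keep the k features with the largest |correlation| to the label."""
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
    return sorted(ranked[:k])


def project(X, cols):
    """Keep only the selected feature columns."""
    return [[row[j] for j in cols] for row in X]


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(zip(X_train, y_train), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def evaluate(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and error for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "error": 1.0 - accuracy}


# Made-up toy data (assumption): column 0 is informative, column 1 is noise,
# column 2 is constant. Label 1 stands for "popular", 0 for "not popular".
X_train = [[0.1, 5.0, 1.0], [0.2, 1.0, 1.0], [0.3, 4.0, 1.0],
           [0.8, 2.0, 1.0], [0.9, 5.0, 1.0], [1.0, 1.0, 1.0]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[0.15, 5.0, 1.0], [0.95, 1.0, 1.0]]
y_test = [0, 1]

selected = select_top_k(X_train, y_train, k=1)
train_sel, test_sel = project(X_train, selected), project(X_test, selected)
predictions = [knn_predict(train_sel, y_train, x, k=3) for x in test_sel]
metrics = evaluate(y_test, predictions)
print(selected, predictions, metrics)
```

Ranking features on the training split only, and then applying the same column selection to the test split, keeps the evaluation from leaking test information into the selection step. Information gain or Relief could be swapped in for `pearson` as the scoring function without changing the rest of the sketch.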