Basic Article Information

  • Title: A Design of an Automatic Web Page Classification System
  • Authors: Tarek M. Mahmoud; Tarek Abd-El-Hafeez; Doha Taha Nour El-Deen
  • Journal: Current Journal of Applied Science and Technology
  • Print ISSN: 2457-1024
  • Publication year: 2017
  • Volume: 18
  • Issue: 6
  • Pages: 1-14
  • Language: English
  • Publisher: Sciencedomain International
  • Abstract: Web page classification is a common problem on today's Internet. This paper introduces an automatic web page classification system. The proposed system aims to increase classification accuracy by combining three well-known algorithms: the Naïve Bayesian classifier, the Support Vector Machine, and K-Nearest Neighbor. The experimental results show that the hybrid of these three classifiers performs better than any one of them used alone (Naïve Bayesian, which is often chosen for its speed and accuracy, K-Nearest Neighbor, or Support Vector Machine), reducing the false positive rate and achieving the highest accuracy. The experiments, run on 10,000 web pages (30% for training and 70% for testing), showed high efficiency, with average values of 0% for the false positive rate, 1% for the true positive rate, 1% for the F-measure, and 99.98% for the overall accuracy.
  • Keywords: Web page classification; naïve Bayesian algorithm; support vector machine; K-nearest neighbor
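
The abstract reports a false positive rate, true positive rate, F-measure, and overall accuracy for a hybrid of three classifiers, but this record does not say how the three outputs are combined. The sketch below assumes a simple majority vote, which may differ from the paper's actual scheme, and computes the four metrics from their standard definitions. The function names, the labels "target" and "other", and the toy predictions are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among the per-classifier predictions.

    Assumption: hard majority voting. Ties go to the label seen first.
    """
    return Counter(predictions).most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive):
    """Compute TPR, FPR, F-measure, and accuracy for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    tpr = tp / (tp + fn) if (tp + fn) else 0.0        # recall
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f_measure = (
        2 * precision * tpr / (precision + tpr) if (precision + tpr) else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"tpr": tpr, "fpr": fpr, "f_measure": f_measure, "accuracy": accuracy}


# Hypothetical toy data: five pages, each labelled by three classifiers.
y_true = ["target", "target", "other", "other", "other"]
nb_pred = ["target", "other", "other", "other", "target"]
svm_pred = ["target", "target", "other", "other", "other"]
knn_pred = ["target", "target", "other", "target", "target"]

# Combine the three predictions page by page, then score the result.
combined = [majority_vote(votes) for votes in zip(nb_pred, svm_pred, knn_pred)]
metrics = binary_metrics(y_true, combined, positive="target")
```

On this toy data the combined vote corrects single-classifier mistakes on the second and fourth pages, but it still mislabels the fifth page, where two of the three classifiers are wrong.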