Basic article information

  • Title: Comparative Analysis of Classification Algorithms for Email Spam Detection
  • Authors: Shafi’i Muhammad Abdulhamid ; Maryam Shuaib ; Oluwafemi Osho
  • Journal: International Journal of Computer Network and Information Security
  • Print ISSN: 2074-9090
  • Electronic ISSN: 2231-4946
  • Publication year: 2018
  • Volume: 10
  • Issue: 1
  • Pages: 60-67
  • DOI: 10.5815/ijcnis.2018.01.07
  • Publisher: MECS Publisher
  • Abstract: The increase in the use of email in everyday transactions for many businesses and for general communication, due to its cost effectiveness and efficiency, has made emails vulnerable to attacks including spamming. Spam emails, also called junk emails, are unsolicited messages that are almost identical and sent to multiple recipients randomly. In this study, a performance analysis is done on several classification algorithms: Bayesian Logistic Regression, Hidden Naïve Bayes, Radial Basis Function (RBF) Network, Voted Perceptron, Lazy Bayesian Rule, Logit Boost, Rotation Forest, NNge, Logistic Model Tree, REP Tree, Naïve Bayes, Multilayer Perceptron, Random Tree and J48. The performance of the algorithms was measured in terms of Accuracy, Precision, Recall, F-Measure, Root Mean Squared Error, Receiver Operator Characteristics Area and Root Relative Squared Error using the WEKA data mining tool. To have a balanced view of the classification algorithms’ performance, no feature selection or performance boosting method was employed. The research showed that a number of classification algorithms exist that, if properly explored through feature selection, will yield more accurate results for email classification. Rotation Forest is found to be the classifier that gives the best accuracy, 94.2%. Though none of the algorithms achieved 100% accuracy in sorting spam emails, Rotation Forest came closest to the most accurate result.
  • Keywords: Email spam; classification algorithms; Bayesian Logistic Regression; Hidden Naïve Bayes; Rotation Forest