
Article Information

  • Title: Learning to Detect Spam: Naive-Euclidean Approach
  • Authors: Tony Y.T. Chan; Jie Ji; Qiangfu Zhao
  • Journal: International Journal of Signal Processing, Image Processing and Pattern Recognition
  • Print ISSN: 2005-4254
  • Year of publication: 2008
  • Volume: 1
  • Issue: 1
  • Publisher: SERSC
  • Abstract: A method is proposed for learning to classify spam and nonspam emails. It combines the strategy of the Best Stepwise Feature Selection with a classifier of Euclidean nearest-neighbor. Each text email is first transformed into a vector of D-dimensional Euclidean space. Emails were divided into training and test sets in the manner of 10-fold cross-validation. Three experiments were performed, and their elapsed CPU times and accuracies reported. The proposed spam detection learner was found to be extremely fast in recognition and with good error rates. It could be used as a baseline learning agent, in terms of CPU time and accuracy, against which other learning agents can be measured.
  • Keywords: Spam email detection; machine learning; feature selection; Euclidean vector space; 10-fold cross-validation; nearest-neighbor classifiers
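
The abstract names three ingredients: stepwise feature selection, a Euclidean nearest-neighbor classifier, and 10-fold cross-validation. The sketch below shows one way these could fit together. It is not the authors' implementation: the abstract does not specify the selection procedure or the feature encoding, so the greedy forward search, the 1-nearest-neighbor rule, the fold assignment by index, and all function names (`euclidean`, `predict_1nn`, `cv_accuracy`, `forward_select`) are assumptions made for illustration, and the data are a synthetic toy set rather than real emails.

```python
import math

def euclidean(a, b, features):
    """Euclidean distance between a and b, using only the given feature indices."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in features))

def predict_1nn(train_x, train_y, x, features):
    """Return the label of the training vector nearest to x (assumed 1-NN rule)."""
    nearest = min(range(len(train_x)),
                  key=lambda j: euclidean(train_x[j], x, features))
    return train_y[nearest]

def cv_accuracy(xs, ys, features, k=10):
    """k-fold cross-validated accuracy; sample i goes to fold i % k (an assumption)."""
    n = len(xs)
    correct = 0
    for fold in range(k):
        train_idx = [i for i in range(n) if i % k != fold]
        test_idx = [i for i in range(n) if i % k == fold]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        for i in test_idx:
            if predict_1nn(train_x, train_y, xs[i], features) == ys[i]:
                correct += 1
    return correct / n

def forward_select(xs, ys, k=10):
    """Greedy forward selection: repeatedly add the feature that most improves
    cross-validated accuracy, and stop when no candidate improves it."""
    remaining = set(range(len(xs[0])))
    chosen, best_acc = [], 0.0
    while remaining:
        cand_acc, cand = -1.0, None
        for f in sorted(remaining):
            acc = cv_accuracy(xs, ys, chosen + [f], k)
            if acc > cand_acc:
                cand_acc, cand = acc, f
        if cand_acc <= best_acc:
            break
        chosen.append(cand)
        remaining.remove(cand)
        best_acc = cand_acc
    return chosen, best_acc

# Synthetic toy data (not from the paper): feature 0 separates the two
# classes, features 1 and 2 are unrelated to the label.
ys = [i % 2 for i in range(20)]
xs = [[10.0 * ys[i] + 0.1 * (i % 3), float((i * 7) % 5), float((i * 3) % 4)]
      for i in range(20)]

selected, accuracy = forward_select(xs, ys)
print(selected, accuracy)
```

On this toy set the search keeps only the one informative feature, which illustrates why a small selected subset can make nearest-neighbor recognition cheap: each distance computation touches only the chosen dimensions.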