
Article Information

  • Title: Experiments on the Use of Machine Learning Classification Methods in Online Crime Text Filtering and Classification
  • Authors: Fadl Mutaher Ba-Alwi; Mohammed Albared
  • Journal: Current Journal of Applied Science and Technology
  • Print ISSN: 2457-1024
  • Year of publication: 2015
  • Volume: 12
  • Issue: 5
  • Pages: 1-12
  • Language: English
  • Publisher: Sciencedomain International
  • Abstract: With the exponential growth of textual information available on the Internet, there is an emerging need to find relevant and timely knowledge about crimes within this large volume of information. The sheer size of such data makes retrieving and analyzing texts manually a very difficult task. Furthermore, domain-specific document classification is a hard task and suffers from low classification efficiency due to overlap among domain subclasses. This work focuses on finding an appropriate classification model for crime domain-specific knowledge on the Web. To that end, a two-level classification method for online crime text filtering and classification is used. At each level, three feature selection methods (Gini index, Chi-square statistic and Information gain) and three learning methods (K-nearest neighbor, Naive Bayes (NB) and support vector machine (SVM)) are investigated. The experimental results at the first level indicate that the Information gain feature selection method performs best for crime term selection, and that both SVM and NB exhibit the best performance for crime text filtering. The experimental results at the second level indicate that the Gini index feature selection method performs best for crime-type term selection, and that the SVM classifier exhibits the best performance in classifying crime documents into their appropriate crime types.
  • Keywords: Crime data mining; web mining; focused crawling; classification
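
The abstract names Chi-square and Information gain as two of the term-scoring methods compared at each level. As a rough illustration only, and not the authors' implementation, the sketch below scores terms against a binary label (crime vs. other, as in first-level filtering) using the standard 2x2 contingency-table forms of both statistics. The toy documents, the label names, and all function names are assumptions made for this example; the Gini index is left out because the record does not say which variant the paper uses.

```python
import math


def contingency(docs, labels, term, positive):
    """Return (a, b, c, d): document counts by term presence and class.

    a: term present, positive class    b: term present, other class
    c: term absent,  positive class    d: term absent,  other class
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        is_pos = label == positive
        if has_term and is_pos:
            a += 1
        elif has_term:
            b += 1
        elif is_pos:
            c += 1
        else:
            d += 1
    return a, b, c, d


def chi_square(a, b, c, d):
    """Chi-square statistic of a 2x2 term/class table."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def entropy(counts):
    """Shannon entropy in bits of a list of counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((k / total) * math.log2(k / total) for k in counts if k > 0)


def information_gain(a, b, c, d):
    """Class entropy minus class entropy conditioned on term presence."""
    n = a + b + c + d
    h_class = entropy([a + c, b + d])
    h_cond = ((a + b) / n) * entropy([a, b]) + ((c + d) / n) * entropy([c, d])
    return h_class - h_cond


def rank_terms(docs, labels, positive, score):
    """Score every vocabulary term; highest score first, ties by term."""
    vocab = set().union(*docs)
    scored = [(t, score(*contingency(docs, labels, t, positive))) for t in vocab]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical toy corpus: each document is a set of tokens.
docs = [
    {"police", "suspect", "robbery"},
    {"murder", "police", "suspect"},
    {"football", "match", "goal"},
    {"police", "parade", "festival"},
]
labels = ["crime", "crime", "other", "other"]

top_by_ig = rank_terms(docs, labels, "crime", information_gain)
top_by_chi = rank_terms(docs, labels, "crime", chi_square)
```

In this toy corpus "suspect" occurs in exactly the crime documents, so both rankings place it first, while "police" also occurs in a non-crime document and scores lower. In a two-level setup like the one the abstract describes, the same scoring could be repeated at the second level with crime-type labels in place of the crime/other label.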