
Article Information

  • Title: Machine Learning Algorithms for Document Classification: Comparative Analysis
  • Authors: Faizur Rashid; Suleiman M. A. Gargaare; Abdulkadir H. Aden
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Online ISSN: 2156-5570
  • Year of publication: 2022
  • Volume: 13
  • Issue: 4
  • DOI: 10.14569/IJACSA.2022.0130430
  • Language: English
  • Publisher: Science and Information Society (SAI)
  • Abstract: Automated document classification is a fundamental machine learning task that assigns categories automatically to scanned images of documents. It has reached a state-of-the-art stage, but the performance and efficiency of the algorithms still need to be verified by comparison. The objective was to identify the most efficient classification algorithms according to the usage of the fundamentals of science. Experimental methods were used, collecting data from a total of 1,080 students and researchers from Ethiopian universities and a meta-data set of Banknotes, Crowdsourced Mapping, and VxHeaven provided by UC Irvine. 25% of the respondents felt that KNN is better than the other models. Performance was analyzed across several parameters for KNN, SVM, Perceptron, and Gaussian NB: an accuracy of 99.85%, a precision of 0.996, a recall of 100%, an F-score of 0.997, classification time, and running time were observed. KNN performed better than the other classification algorithms, with a lower error rate of 0.0002 and the lowest classification time and running time, ~413 and 3.6978 microseconds respectively. Considering all the parameters, the KNN classifier is concluded to be the best algorithm.
  • Keywords: Document classification; machine learning algorithms; text classification; analysis
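
The abstract compares classifiers by accuracy, precision, recall, F-score, and error rate. As a reading aid, the following is a minimal sketch of the standard binary-classification definitions of those quantities. It is not the authors' evaluation code; the function name `classification_metrics` and the confusion-matrix counts are hypothetical and chosen only for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard metrics from binary confusion-matrix counts.

    tp, fp, fn, tn: true positives, false positives,
    false negatives, true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F-score here is F1: the harmonic mean of precision and recall.
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "error_rate": 1 - accuracy,
    }


# Hypothetical counts, not taken from the paper.
metrics = classification_metrics(tp=90, fp=10, fn=0, tn=100)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these definitions, a recall of 100% means no false negatives, and the error rate is simply one minus the accuracy.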