Journal: International Journal of Computer Trends and Technology
Electronic ISSN: 2231-2803
Publication year: 2014
Volume: 7
Issue: 4
DOI: 10.14445/22312803/IJCTT-V7P109
Publisher: Seventh Sense Research Group
Abstract: Text classification is one of the most important tasks in data mining. This paper compares several variations of the vector space model (VSM) using the k-nearest neighbors (KNN) algorithm. The comparison is based on the most popular text evaluation measures. Experimental results on the Saudi data sets show that the Cosine coefficient outperformed the Dice and Jaccard coefficients.
Keywords: Arabic data sets; Data mining; Text categorization; Term weighting; VSM.
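
Illustrative note: the abstract names three similarity coefficients used with KNN but this record does not give their formulas. The sketch below uses the common vector-space definitions of Cosine, Dice, and Jaccard over term-weight vectors, plus a simple majority-vote KNN. The function names, the toy vectors, and the voting rule are assumptions for illustration, not the paper's actual implementation or data.

```python
import math
from collections import Counter

def dot(x, y):
    """Inner product of two equal-length term-weight vectors."""
    return sum(a * b for a, b in zip(x, y))

# Note: all three measures divide by zero if both vectors are all zeros.
def cosine(x, y):
    return dot(x, y) / math.sqrt(dot(x, x) * dot(y, y))

def dice(x, y):
    return 2 * dot(x, y) / (dot(x, x) + dot(y, y))

def jaccard(x, y):
    xy = dot(x, y)
    return xy / (dot(x, x) + dot(y, y) - xy)

def knn_predict(query, train, k, sim):
    """Label the query by majority vote among its k most similar training vectors.

    `train` is a list of (vector, label) pairs; `sim` is one of the
    similarity functions above. Hypothetical helper, not from the paper.
    """
    ranked = sorted(train, key=lambda item: sim(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy data (made up): three documents over a three-term vocabulary.
train = [([1, 1, 0], "a"), ([1, 1, 1], "a"), ([0, 0, 1], "b")]
prediction = knn_predict([1, 1, 0], train, k=2, sim=cosine)
```

Swapping `sim=cosine` for `dice` or `jaccard` changes only the ranking step, which is the kind of comparison the abstract describes.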