Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Publication year: 2014
Volume: 5
Issue: 3
Pages: 4175-4180
Publisher: TechScience Publications
Abstract: With the ever-increasing volume of information, document clustering is used to organize documents automatically so that relevant information can be retrieved quickly. Document clustering is the automatic grouping of text documents into clusters such that documents within a cluster share similar concepts. Document representation is a very important step in any Information Retrieval (IR) system. In traditional document representation methods, the feature vector representing a document is constructed from the frequency counts of the document's terms. However, these methods cannot identify semantically related terms. In this paper, we present a semantic document clustering method that uses Universal Networking Language (UNL) and Particle Swarm Optimization (PSO). We generate feature vectors using UNL and cluster the documents with a hybrid PSO+K-means algorithm. Experiments are performed to compare the efficiency of the UNL method with that of the traditional term-frequency-based method. The results show that the PSO-based clustering method using UNL performs better than the term-frequency-based method.
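Illustrative sketch: the abstract names a hybrid PSO+K-means algorithm but gives no implementation details, so the following Python is only one plausible reading of it and not the authors' method. It assumes that PSO runs first as a global search over candidate centroid sets and that K-means then refines the best set found. The function names, the Euclidean distance, the fitness function (mean distance from each document to its nearest centroid), the parameter values (`w`, `c1`, `c2`, particle and iteration counts), and the toy feature vectors are all assumptions made for illustration; in the paper the vectors would be derived from UNL, which is not modeled here.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(docs, centroids):
    """Index of the nearest centroid for every document."""
    return [min(range(len(centroids)), key=lambda c: dist(d, centroids[c]))
            for d in docs]

def fitness(docs, centroids):
    """Mean distance from each document to its nearest centroid (lower is better)."""
    return sum(min(dist(d, c) for c in centroids) for d in docs) / len(docs)

def pso_phase(docs, k, rng, n_particles=10, iters=30, w=0.72, c1=1.49, c2=1.49):
    """Global search: each particle encodes one candidate set of k centroids."""
    dim = len(docs[0])
    # Seed every particle with k randomly chosen documents; velocities start at zero.
    pos = [[list(d) for d in rng.sample(docs, k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in p] for p in pos]
    pbest_fit = [fitness(docs, p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for c in range(k):
                for j in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Inertia + pull toward personal best + pull toward global best.
                    vel[i][c][j] = (w * vel[i][c][j]
                                    + c1 * r1 * (pbest[i][c][j] - pos[i][c][j])
                                    + c2 * r2 * (gbest[c][j] - pos[i][c][j]))
                    pos[i][c][j] += vel[i][c][j]
            f = fitness(docs, pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in pos[i]], f
                if f < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in pos[i]], f
    return gbest

def kmeans_phase(docs, centroids, iters=20):
    """Local refinement: standard K-means started from the PSO result."""
    centroids = [c[:] for c in centroids]
    for _ in range(iters):
        labels = assign(docs, centroids)
        for c in range(len(centroids)):
            members = [d for d, lab in zip(docs, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(docs, centroids)

def hybrid_pso_kmeans(docs, k, seed=0):
    """PSO for a good starting point, then K-means to converge locally."""
    rng = random.Random(seed)
    return kmeans_phase(docs, pso_phase(docs, k, rng))

# Toy feature vectors (hypothetical): two clearly separated groups of documents.
docs = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0], [0.0, 1.0, 0.9], [0.1, 0.9, 1.0]]
centroids, labels = hybrid_pso_kmeans(docs, k=2)
```

The motivation usually given for such a hybrid is that K-means is sensitive to its initial centroids, while PSO searches more globally but converges slowly; handing the PSO result to K-means is one way to combine the two. Swapping in a cosine-based distance would only require changing `dist`.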