Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Online ISSN: 1817-3195
Year: 2014
Volume: 69
Issue: 3
Publisher: Journal of Theoretical and Applied
Abstract: Data mining is a collection of techniques used to extract useful information from large databases. Data clustering is a popular data mining technique: it is the task of grouping a set of objects so that similar objects are placed in the same cluster while dissimilar objects fall into separate clusters. Fuzzy c-means (FCM) is one of the most popular clustering algorithms. However, it has limitations, such as sensitivity to initialization and a tendency to get stuck at local optima. Swarm intelligence algorithms are global optimization techniques that have recently been applied successfully to many real-world optimization problems. The Constriction Factor Particle Swarm Optimization (cfPSO) algorithm is a population-based global optimization technique that can be used to solve data clustering problems. Euclidean distance is the well-known metric used in most of the clustering literature. Its drawbacks are that it is blind to correlated variables, is not robust in noisy environments, is affected by outlier data points, and handles well only clusters of equal size and density with spherical shapes, whereas real-world data sets may exhibit other shapes. In this paper, a fuzzy-based constriction factor PSO algorithm (FUZZY-cfPSO-FCM) is proposed that uses non-Euclidean distance metrics, namely the Kernel, Mahalanobis and New distance metrics, and it is tested on several benchmark data sets from the UCI machine learning repository. The proposed hybrid algorithm combines the advantages of the FCM and cfPSO algorithms. The clustering results are evaluated by fitness value, accuracy rate and failure rate. Experimental results show that the proposed hybrid algorithm achieves better results on various data sets.
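
The abstract does not give the formulas used in the paper, so the following is only a minimal sketch of the two standard building blocks it names, written under stated assumptions. It assumes the Clerc-Kennedy form of the constriction factor, the usual FCM membership update, and a Gaussian kernel-induced distance d^2(x, v) = 2(1 - K(x, v)). All function names and the parameters c1, c2, sigma and m are hypothetical choices made for illustration, not taken from the paper.

```python
import math


def constriction_factor(c1=2.05, c2=2.05):
    """Clerc-Kennedy constriction factor; requires c1 + c2 > 4 (assumed form)."""
    phi = c1 + c2
    if phi <= 4:
        raise ValueError("c1 + c2 must exceed 4")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))


def sq_euclidean(x, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, v))


def sq_kernel(x, v, sigma=1.0):
    """Squared distance induced by a Gaussian kernel (assumed kernel choice)."""
    return 2.0 * (1.0 - math.exp(-sq_euclidean(x, v) / sigma ** 2))


def fcm_memberships(points, centers, sq_dist=sq_euclidean, m=2.0, eps=1e-12):
    """Standard FCM membership update for fuzzifier m > 1.

    Returns one row per point; each row sums to 1 across the clusters.
    """
    exponent = 1.0 / (m - 1.0)
    rows = []
    for x in points:
        # Clamp to eps so a point sitting exactly on a center does not divide by zero.
        d2 = [max(sq_dist(x, v), eps) for v in centers]
        inv = [(1.0 / d) ** exponent for d in d2]
        total = sum(inv)
        rows.append([w / total for w in inv])
    return rows


def fcm_objective(points, centers, memberships, sq_dist=sq_euclidean, m=2.0):
    """FCM objective: sum of u^m * d^2, usable as a particle's fitness value."""
    return sum(
        (u ** m) * sq_dist(x, v)
        for x, row in zip(points, memberships)
        for u, v in zip(row, centers)
    )
```

In a hybrid of this kind, each particle would typically encode a candidate set of cluster centers, with `fcm_objective` serving as the fitness to minimize and `constriction_factor()` scaling the velocity update. Swapping `sq_dist` is how a non-Euclidean metric would enter the computation in this sketch.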