Article information

  • Title: DATA CLUSTERING BASED ON HYBRID OF FUZZY AND SWARM INTELLIGENCE ALGORITHM USING EUCLIDEAN AND NON-EUCLIDEAN DISTANCE METRICS: A COMPARATIVE STUDY
  • Authors: O.A. MOHAMED JAFAR; R. SIVAKUMAR
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Online ISSN: 1817-3195
  • Year of publication: 2014
  • Volume: 69
  • Issue: 3
  • Publisher: Journal of Theoretical and Applied
  • Abstract: Data mining is a collection of techniques for extracting useful information from large databases. Data clustering is a popular data mining technique: it groups a set of objects into classes so that similar objects fall in the same cluster and dissimilar objects fall in separate clusters. Fuzzy c-means (FCM) is one of the most popular clustering algorithms. However, it has limitations, such as sensitivity to initialization and a tendency to get stuck at local optima. Swarm intelligence algorithms are global optimization techniques that have recently been applied successfully to many real-world optimization problems. The Constriction Factor Particle Swarm Optimization (cfPSO) algorithm is a population-based global optimization technique that can be used to solve data clustering problems. Euclidean distance is a well-known metric used in most of the literature. Its drawbacks include blindness to correlated variables, lack of robustness in noisy environments, sensitivity to outliers, and suitability only for data sets whose clusters have equal size, equal density and spherical shape. Real-world data sets, however, may exhibit different shapes. In this paper, a fuzzy-based constriction factor PSO (FUZZY-cfPSO-FCM) algorithm is proposed that uses non-Euclidean distance metrics (Kernel, Mahalanobis and New distance) and is evaluated on several benchmark data sets from the UCI Machine Learning Repository. The proposed hybrid algorithm combines the advantages of the FCM and cfPSO algorithms. The clustering results are evaluated through fitness value, accuracy rate and failure rate. Experimental results show that the proposed hybrid algorithm achieves better results on various data sets.
  • Keywords: Data Clustering; Fuzzy c-means (FCM); Swarm Intelligence (SI); Constriction Factor Particle Swarm Optimization (cfPSO); Euclidean Distance Metric; Non-Euclidean Distance Metrics
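
The building blocks named in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. It uses the textbook Clerc-Kennedy constriction coefficient, the standard FCM membership update, and a Gaussian kernel-induced distance as one possible non-Euclidean metric. The function names, the default values c1 = c2 = 2.05, the kernel width sigma, and the choice of the FCM objective as particle fitness are all assumptions for illustration. The abstract does not define the paper's "Kernel" or "New distance" formulas, so they are not reproduced here.

```python
import math
import random


def constriction_factor(c1=2.05, c2=2.05):
    """Clerc-Kennedy constriction coefficient; requires c1 + c2 > 4."""
    phi = c1 + c2
    if phi <= 4.0:
        raise ValueError("c1 + c2 must exceed 4")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))


def euclidean_sq(x, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, v))


def gaussian_kernel_sq(x, v, sigma=1.0):
    """Squared kernel-induced distance 2 * (1 - K(x, v)).

    Assumption: K is a Gaussian kernel with width sigma; the paper's
    exact kernel is not given in the abstract.
    """
    return 2.0 * (1.0 - math.exp(-euclidean_sq(x, v) / sigma ** 2))


def fcm_memberships(points, centers, dist_sq=euclidean_sq, m=2.0, eps=1e-12):
    """Standard FCM membership update; u[k][i] is point k's degree in cluster i."""
    power = 1.0 / (m - 1.0)  # distances are already squared
    u = []
    for x in points:
        # eps guards against division by zero when a point sits on a center
        inv = [(1.0 / max(dist_sq(x, v), eps)) ** power for v in centers]
        total = sum(inv)
        u.append([w / total for w in inv])
    return u


def fcm_objective(points, centers, u, dist_sq=euclidean_sq, m=2.0):
    """FCM objective J_m; assumed here as the fitness of a particle's centers."""
    return sum(
        u[k][i] ** m * dist_sq(x, v)
        for k, x in enumerate(points)
        for i, v in enumerate(centers)
    )


def cfpso_step(position, velocity, pbest, gbest, chi, c1=2.05, c2=2.05):
    """One constriction-factor PSO update for a single particle."""
    new_v = [
        chi * (v + c1 * random.random() * (pb - x) + c2 * random.random() * (gb - x))
        for x, v, pb, gb in zip(position, velocity, pbest, gbest)
    ]
    new_x = [x + v for x, v in zip(position, new_v)]
    return new_x, new_v
```

In a hybrid of this kind, each particle would typically encode a full set of cluster centers, `fcm_memberships` and `fcm_objective` would score it, and `cfpso_step` would move it. Passing `dist_sq=gaussian_kernel_sq` swaps the metric without touching the rest of the loop.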