
Article information

  • Title: Development of An External Cluster Validity Index using Probabilistic Approach and Min-max Distance
  • Authors: Abhay Kumar Alok; Sriparna Saha; Asif Ekbal
  • Journal: International Journal of Computer Information Systems and Industrial Management Applications
  • Print ISSN: 2150-7988
  • Electronic ISSN: 2150-7988
  • Publication year: 2014
  • Issue: 6
  • Pages: 494-504
  • Publisher: Machine Intelligence Research Labs (MIR Labs)
  • Abstract: Validating a given clustering result is a very challenging task in the real world. For this purpose, several cluster validity indices have been developed in the literature. Cluster validity indices are divided into two main categories: external and internal. External cluster validity indices rely on some available supervised information, and internal validity indices utilize the intrinsic structure of the data. In this paper, a new external cluster validity index, MMI, and its normalized version, NMMI, have been implemented based on Max-Min distance along data points and prior information using the structure of the data. A new probabilistic approach has been implemented to find the correct correspondence between the true and obtained clustering. Different possible probabilistic approaches have been considered, and attempts have been made to rectify their problems. The genetic K-means clustering algorithm (GAK-means) and the single linkage clustering technique have been used as the underlying clustering techniques. Results of the proposed index for classifying the true partitioning results are shown for six artificial and two real-life data sets. GAK-means and single linkage clustering are used as the underlying partitioning techniques, with the number of clusters varied over a range. The MMI and NMMI indices are then used to determine the appropriate number of clusters. The performance of MMI, its two versions MMI old and MMI new, and its normalized version NMMI is compared with the existing external cluster validity indices: F-measure, purity, normalized mutual information (NMI), Rand index (RI), and adjusted Rand index (ARI). The proposed MMI index works well for two-class and multi-class data sets.
  • Keywords: Cluster validity; External cluster validity index; Genetic K-means clustering algorithm; Single linkage clustering
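
The abstract compares the proposed index against existing external validity indices such as purity and the Rand index (RI). As a rough illustration of what an external index computes, the sketch below implements the textbook definitions of those two baselines. It does not reproduce MMI or NMMI, whose formulas are not given in this record. The function names and the toy label lists are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def purity(true_labels, pred_labels):
    """Fraction of points that belong to the majority true class of their cluster."""
    clusters = {}
    for t, p in zip(true_labels, pred_labels):
        # For each obtained cluster, count how many points come from each true class.
        clusters.setdefault(p, Counter())[t] += 1
    majority_total = sum(max(counts.values()) for counts in clusters.values())
    return majority_total / len(true_labels)


def rand_index(true_labels, pred_labels):
    """Fraction of point pairs on which the two partitions agree."""
    pairs = list(combinations(range(len(true_labels)), 2))
    # A pair agrees if both partitions group it together or both separate it.
    agree = sum(
        (true_labels[i] == true_labels[j]) == (pred_labels[i] == pred_labels[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical toy data: six points, one of them assigned to the wrong cluster.
truth = [0, 0, 0, 1, 1, 1]
found = [0, 0, 1, 1, 1, 1]
print(purity(truth, found))      # 5 of 6 points match their cluster's majority class
print(rand_index(truth, found))  # 10 of 15 pairs are treated consistently
```

Both functions depend only on which points share a label, so renaming the cluster labels does not change the scores. That is why these baselines need no explicit correspondence between true and obtained clusters, the step the abstract addresses with its probabilistic approach.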