
Article information

  • Title: A space-structure based dissimilarity measure for categorical data
  • Authors: Kevin Alejandro Hernández; D. Cárdenas Peña; Álvaro A. Orozco
  • Journal: International Journal of Electrical and Computer Engineering
  • Electronic ISSN: 2088-8708
  • Year: 2021
  • Volume: 11
  • Issue: 1
  • Pages: 620-627
  • DOI: 10.11591/ijece.v11i1.pp620-627
  • Publisher: Institute of Advanced Engineering and Science (IAES)
  • Abstract: The development of analysis methods for categorical data began in the 1990s and has accelerated in recent years. The performance of many of these methods, however, depends on the metric used. Determining a dissimilarity measure for categorical data is therefore one of the most attractive recent challenges in data mining. Several similarity/dissimilarity measures proposed in the literature suffer from high computational cost or poor performance. For this reason, we propose a new distance metric for categorical data, called weighted pairing (W-P), based on feature space-structure, in which each weight is understood as the degree to which an attribute contributes to a compact cluster structure. The performance of the W-P metric was evaluated in an unsupervised learning framework in terms of a cluster quality index. We tested W-P on six real categorical datasets downloaded from the public UCI repository and compared it with the distance metric (DM3) method and the Hamming metric (H-SBI). Results show that our proposal outperforms DM3 and H-SBI in different experimental configurations. W-P also achieves the highest Rand index values and a better clustering discriminant than the other methods.
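
The abstract does not give the W-P formula, so the sketch below is not the authors' method. It only illustrates the two general ideas the abstract names: a Hamming-style dissimilarity in which each attribute carries a weight, and the Rand index used to score a clustering against reference labels. The function names `weighted_mismatch` and `rand_index`, and the example weights, are assumptions made for illustration.

```python
from itertools import combinations


def weighted_mismatch(x, y, weights):
    """Weighted Hamming-style dissimilarity between two categorical records.

    Illustrative only: the weights here are supplied by the caller, whereas
    the paper derives them from the feature space-structure (not shown in
    the abstract).
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Sum the weights of the attributes whose categories differ, then normalise.
    return sum(w for a, b, w in zip(x, y, weights) if a != b) / total


def rand_index(labels_a, labels_b):
    """Fraction of record pairs on which two labelings agree (same/different cluster)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("at least two records are required")
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


if __name__ == "__main__":
    r1 = ("red", "small", "round")
    r2 = ("red", "large", "round")
    print(weighted_mismatch(r1, r2, [1, 1, 1]))        # plain Hamming, normalised
    print(weighted_mismatch(r1, r2, [0.5, 2.0, 0.5]))  # second attribute weighted up
    print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))      # same partition, renamed labels
```

With equal weights the first function reduces to the normalised Hamming distance, which is the kind of baseline the abstract compares against; raising the weight of an attribute makes a mismatch on it count for more.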