
Article Information

  • Title: Divide and Conquer Method for Clustering Mixed Numerical and Categorical Data
  • Author: Dileep Kumar Murala
  • Journal: International Journal of Computer Science and Information Technologies
  • Electronic ISSN: 0975-9646
  • Year of publication: 2013
  • Volume: 4
  • Issue: 1
  • Pages: 103-106
  • Publisher: TechScience Publications
  • Abstract: Clustering is a challenging task in data mining. The aim of cluster analysis is to assign objects to groups (clusters) in such a way that two objects from the same cluster are more similar than two objects from different clusters. Various clustering algorithms have been developed to group data into clusters in diverse domains. However, these algorithms work effectively on either pure numeric data or pure categorical data; most of them perform poorly on mixed categorical and numeric data. In this paper, we propose a divide-and-conquer technique to cluster mixed numeric and categorical datasets efficiently. First, the original mixed dataset is divided into two sub-datasets: a pure categorical dataset and a pure numeric dataset. Next, existing well-established clustering algorithms designed for each type of dataset are employed to produce the corresponding clusters. Last, the clustering results on the categorical and numeric datasets are combined into a categorical dataset, on which the categorical clustering algorithm is run to obtain the final clusters.
  • Keywords: clustering; novel divide-and-conquer; mixed dataset; numerical data; categorical data
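
The three steps described in the abstract (split, cluster each part, combine and re-cluster) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not name the algorithms used, so k-means for the numeric part and k-modes for the categorical part are assumed here. All function names, the deterministic seeding rule, and the toy records are hypothetical.

```python
from collections import Counter


def _first_distinct(rows, k):
    """Pick the first k distinct rows as initial centers (a simplification)."""
    seeds = []
    for row in rows:
        if list(row) not in seeds:
            seeds.append(list(row))
        if len(seeds) == k:
            break
    return seeds


def _cluster(rows, k, dist, summarize, iters=20):
    """Generic assign/update loop shared by k-means and k-modes."""
    centers = _first_distinct(rows, k)
    labels = []
    for _ in range(iters):
        # Assign every row to its nearest center (ties go to the lower index).
        labels = [min(range(len(centers)), key=lambda c: dist(r, centers[c]))
                  for r in rows]
        # Recompute each center column by column; keep it if the cluster is empty.
        for c in range(len(centers)):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [summarize(col) for col in zip(*members)]
    return labels


def kmeans(rows, k):
    """Assumed numeric algorithm: squared Euclidean distance, mean update."""
    return _cluster(rows, k,
                    dist=lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)),
                    summarize=lambda col: sum(col) / len(col))


def kmodes(rows, k):
    """Assumed categorical algorithm: Hamming distance, mode update."""
    return _cluster(rows, k,
                    dist=lambda a, b: sum(x != y for x, y in zip(a, b)),
                    summarize=lambda col: Counter(col).most_common(1)[0][0])


def cluster_mixed(records, numeric_idx, categorical_idx, k):
    # Step 1: divide the mixed dataset into numeric and categorical parts.
    numeric = [[float(r[i]) for i in numeric_idx] for r in records]
    categorical = [[r[i] for i in categorical_idx] for r in records]
    # Step 2: cluster each part with an algorithm suited to its type.
    num_labels = kmeans(numeric, k)
    cat_labels = kmodes(categorical, k)
    # Step 3: treat the two label columns as a categorical dataset
    # and cluster it again to obtain the final clusters.
    combined = [[f"n{a}", f"c{b}"] for a, b in zip(num_labels, cat_labels)]
    return kmodes(combined, k)


# Toy records (hypothetical): one numeric and one categorical attribute.
records = [(1.0, "red"), (1.2, "red"), (0.8, "red"),
           (9.0, "blue"), (9.5, "blue"), (8.7, "blue")]
print(cluster_mixed(records, numeric_idx=[0], categorical_idx=[1], k=2))
# → [0, 0, 0, 1, 1, 1]
```

Using the same k at every stage is a choice made for brevity; the abstract does not say whether the intermediate and final cluster counts match.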