Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Publication year: 2015
Volume: 6
Issue: 5
Pages: 4459-4464
Publisher: TechScience Publications
Abstract: A cluster is a collection of data objects that are similar to one another and dissimilar to the objects in other clusters. Subspace clustering extends traditional clustering to identify clusters in high-dimensional data sets. There are two major subspace clustering approaches: the top-down approach, which uses sampling techniques that randomly pick sample data points to identify the subspaces and then assigns all data points to form the original clusters, and the bottom-up approach, in which dense regions in low-dimensional spaces are found and then combined to form clusters. The paper discusses the details of the top-down algorithm PROCLUS, which is applied to tasks such as customer segmentation, trend analysis, and classification that need a disjoint partition of the data set, and of CLIQUE, which is used to identify overlapping clusters. The paper highlights the important steps of both algorithms with flowcharts, and reports an experimental study on synthetic data that compares PROCLUS and CLIQUE while varying the dimensionality of the data set, the size of the data set, and the number of clusters.
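
Illustrative note: the bottom-up idea summarized in the abstract (find dense regions in low-dimensional spaces, then combine them) can be sketched roughly as follows. This is not the paper's code or a full CLIQUE implementation. It assumes coordinates scaled to [0, 1), an equal-width grid, and a fixed count threshold; the names `dense_units_1d`, `dense_units_2d`, `n_bins`, and `min_count` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cell_of(point, n_bins):
    """Map a point to its (dimension, bin) grid cells; assumes coordinates in [0, 1)."""
    return [(dim, int(x * n_bins)) for dim, x in enumerate(point)]


def dense_units_1d(points, n_bins, min_count):
    """Return the 1-D grid cells that contain at least min_count points."""
    counts = Counter()
    for p in points:
        counts.update(cell_of(p, n_bins))
    return {unit for unit, c in counts.items() if c >= min_count}


def dense_units_2d(points, n_bins, min_count):
    """Combine dense 1-D cells from different dimensions and keep the dense pairs."""
    one_d = dense_units_1d(points, n_bins, min_count)
    # A 2-D cell can only be dense if both of its 1-D projections are dense.
    candidates = {(a, b) for a, b in combinations(sorted(one_d), 2) if a[0] != b[0]}
    counts = Counter()
    for p in points:
        for pair in combinations(cell_of(p, n_bins), 2):
            if pair in candidates:
                counts[pair] += 1
    return {unit for unit, c in counts.items() if c >= min_count}


# Hypothetical toy data: three points agree in dimensions 0 and 1 only.
points = [(0.10, 0.10, 0.90), (0.15, 0.12, 0.30), (0.12, 0.18, 0.50), (0.85, 0.90, 0.10)]
units = dense_units_2d(points, n_bins=5, min_count=3)
```

In this toy data the only dense 2-D cell lies in the subspace spanned by dimensions 0 and 1, which is the kind of low-dimensional region a bottom-up method would then extend or merge into clusters.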