Journal: International Journal of Computer Science and Information Technologies
Electronic ISSN: 0975-9646
Publication year: 2011
Volume: 2
Issue: 4
Pages: 1825-1831
Publisher: TechScience Publications
Abstract: One of the major problems in cluster analysis is determining the number of clusters in unlabeled data prior to clustering. In this paper, we implement a new method for determining the number of clusters, called Extended Dark Block Extraction (EDBE), which is based on an existing algorithm for Visual Assessment of Cluster Tendency (VAT) of a data set. Its basic steps are: 1) generating a VAT image of an input dissimilarity matrix; 2) performing image segmentation on the VAT image to obtain a binary image, followed by directional morphological filtering; 3) applying a distance transform to the filtered binary image and projecting the pixel values onto the main diagonal axis of the image to form a projection signal; 4) smoothing the projection signal, computing its first-order derivative, and then detecting major peaks and valleys in the resulting signal to decide the number of clusters; and 5) applying the C-Means algorithm to the major peaks. We also implement Extended Cluster Count Extraction (ECCE), which uses VAT in combination with several image processing techniques. Both methods use a Reordered Dissimilarity Image (RDI), produced by the VAT algorithm, which highlights potential clusters as a set of "dark blocks" along the diagonal of the image, corresponding to sets of objects with low dissimilarity. This paper develops a new method for automatically estimating the number of dark blocks in the RDIs of unlabeled data sets and compares the two methods, EDBE and ECCE, for determining the number of clusters in unlabeled data sets.
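
As an illustration of step 1 only, the following is a minimal sketch of VAT-style reordering, assuming the commonly described formulation of VAT: start from an object involved in the largest dissimilarity, then repeatedly append the unordered object closest to any object already ordered. It is not the paper's implementation. The function names `vat_order` and `reorder` and the toy one-dimensional data are hypothetical, and the later EDBE steps (segmentation, morphological filtering, distance transform, projection, peak detection) are not covered.

```python
def vat_order(R):
    """Return a VAT-style ordering for a symmetric dissimilarity matrix R.

    R is a list of lists. This is a sketch of the usual Prim-like
    reordering, not code taken from the paper.
    """
    n = len(R)
    # Start from an object that takes part in the largest dissimilarity.
    start = max(range(n), key=lambda i: max(R[i]))
    order = [start]
    remaining = set(range(n)) - {start}
    # dist[j]: smallest dissimilarity between j and any ordered object.
    dist = {j: R[start][j] for j in remaining}
    while remaining:
        # Pick the unordered object closest to the ordered set
        # (ties broken by index so the result is deterministic).
        j = min(remaining, key=lambda q: (dist[q], q))
        order.append(j)
        remaining.remove(j)
        for q in remaining:
            dist[q] = min(dist[q], R[j][q])
    return order


def reorder(R, order):
    """Permute rows and columns of R to obtain the reordered matrix."""
    return [[R[p][q] for q in order] for p in order]


# Hypothetical toy data: two interleaved groups of points on a line.
points = [0.0, 10.0, 0.5, 10.5, 1.0, 11.0]
R = [[abs(a - b) for b in points] for a in points]

order = vat_order(R)
rdi = reorder(R, order)
print(order)  # [0, 2, 4, 1, 3, 5]
for row in rdi:
    print(row)
```

In the reordered matrix, the two groups appear as two 3-by-3 blocks of small values along the diagonal, which is the "dark block" structure that EDBE and ECCE then try to count automatically.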