Publisher: University of Malaya, Faculty of Computer Science and Information Technology
Abstract: In this paper, we present a new procedure for detecting clusters within unlabelled data sets of the form X = {x1, x2, …, xn} ⊂ Rp. The procedure quickly explores the elements of X with the main goal of discovering the clusters they form. In addition to the number of clusters, it provides an initial prototype for each detected cluster. The only assumptions made are that (1) the two least similar elements of X necessarily belong to two different clusters, and (2) each element has a level of similarity with its nearest prototype that is greater than a certain threshold. This threshold can be either user-defined or determined automatically by the algorithm using a validation process. The effectiveness of the method is demonstrated on both synthetic and real test data sets.
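The two assumptions in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details the abstract does not give. The function name `detect_prototypes`, the similarity measure (one minus the Euclidean distance scaled by the largest pairwise distance), and the rule of promoting the worst-covered element to a new prototype are all assumptions made for illustration. The automatic threshold selection by a validation process is not covered.

```python
import math
from itertools import combinations


def detect_prototypes(points, threshold):
    """Return initial prototypes for `points`, a list of equal-length tuples.

    Hypothetical sketch: `threshold` is a similarity level in [0, 1).
    """
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must lie in [0, 1)")
    if len(points) < 2:
        return list(points)

    # Find the two most distant (least similar) elements of the data set.
    a, b = max(combinations(points, 2), key=lambda pair: math.dist(*pair))
    d_max = math.dist(a, b)
    if d_max == 0:
        return [points[0]]  # all points coincide, so there is a single cluster

    def similarity(x, y):
        # Assumed measure: 1 for identical points, 0 for the most distant pair.
        return 1.0 - math.dist(x, y) / d_max

    # Assumption (1): the two least similar elements lie in different clusters.
    prototypes = [a, b]

    while True:
        # Similarity of every element to its nearest (most similar) prototype.
        best = [max(similarity(x, p) for p in prototypes) for x in points]
        worst = min(range(len(points)), key=best.__getitem__)
        # Assumption (2): stop once every element exceeds the threshold.
        if best[worst] > threshold:
            return prototypes
        # Otherwise the worst-covered element seeds a new cluster.
        prototypes.append(points[worst])
```

For example, calling `detect_prototypes(points, 0.5)` on a small two-dimensional data set returns one prototype per detected group, and the length of the returned list is the estimated number of clusters. Under this assumed measure, a higher threshold tends to produce more prototypes.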