Journal: International Journal of Advanced Research in Computer Engineering & Technology (IJARCET)
Print ISSN: 2278-1323
Publication year: 2013
Volume: 2
Issue: 11
Pages: 2972-2977
Publisher: Shri Pannalal Research Institute of Technology
Abstract: In today's world, an organization generates more information in a week than most people can read in a lifetime. The amount of raw data stored in databases is exploding. Cluster analysis is one of the major data mining methods, and the k-means clustering algorithm is widely used in many applications. The k-means algorithm is computationally expensive, and the quality of the resulting clusters depends on the choice of initial centroids. This paper proposes an improvement on the classic k-means algorithm to produce more accurate clusters. The proposed algorithm comprises a method, based on sorting and partitioning the input data, for finding the initial centroids in accordance with the data distribution. Experimental results show that the proposed algorithm produces better clusters in less computation time.
Keywords: Clustering; Data Mining; Initial Centroid; K-Means; Median.
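
The abstract describes choosing initial centroids by sorting and partitioning the input data, and the keywords mention the median, but the record gives no further detail. The following is a minimal Python sketch of one plausible reading, not the paper's actual procedure. The sort key (squared distance from the origin), the equal-sized contiguous partitions, the coordinate-wise median, and the function names `initial_centroids`, `assign`, and `kmeans` are all assumptions made for illustration.

```python
from statistics import median


def initial_centroids(points, k):
    """Pick k starting centroids by sorting and partitioning the data.

    Assumptions (not taken from the paper): points are sorted by squared
    distance from the origin, split into k contiguous parts of nearly
    equal size, and each part contributes its coordinate-wise median.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    ordered = sorted(points, key=lambda p: sum(x * x for x in p))
    n = len(ordered)
    centroids = []
    for i in range(k):
        # Contiguous slice i of k; never empty because k <= n.
        part = ordered[i * n // k:(i + 1) * n // k]
        centroids.append(tuple(median(col) for col in zip(*part)))
    return centroids


def assign(points, centroids):
    """Group each point with its nearest centroid (squared Euclidean distance)."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(
            range(len(centroids)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
        )
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, max_iter=100):
    """Classic k-means iterations, seeded by initial_centroids."""
    centroids = initial_centroids(points, k)
    for _ in range(max_iter):
        clusters = assign(points, centroids)
        # Recompute each centroid as the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
            (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
    print(initial_centroids(data, 2))
    print(kmeans(data, 2)[0])
```

Because the seeding is deterministic, repeated runs on the same data give the same clusters, unlike random initialization. Whether this particular sort key matches the data distribution well depends on the dataset, so it should be treated as a placeholder for whatever ordering the paper defines.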