Journal: International Journal of Engineering and Computer Science
Print ISSN: 2319-7242
Publication year: 2013
Volume: 2
Issue: 5
Pages: 1568-1571
Publisher: IJECS
Abstract: Data mining is the process of sorting through large databases or data warehouses and extracting knowledge of interest to users. The extracted knowledge may be represented as concepts, rules, laws, and models. Clustering is one such data mining technique: it partitions the data into meaningful clusters so that the distances between objects in the same cluster are as small as possible. Among the wide range of clustering algorithms, k-means is one of the most popular. This paper presents an improved k-means algorithm using the Euclidean distance method. The improved algorithm significantly reduces the intra-cluster error criterion function. Moreover, the distribution of the data objects also improves. The results are verified on two datasets, namely the letter image recognition dataset and the seeds dataset. The effectiveness of the algorithm is shown by comparing the results of the standard k-means algorithm with those of the improved k-means algorithm.
Keywords: Data mining; Clustering; K-means algorithm; Euclidean distance; Criterion function
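
For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal sketch of standard k-means (Lloyd's iteration) with Euclidean distance, not the paper's improved variant, whose details the abstract does not give. It assumes the intra-cluster error criterion is the usual sum of squared distances from each object to its cluster centroid. The function names `euclidean`, `criterion`, and `kmeans`, and the parameters `max_iter` and `seed`, are illustrative choices and do not come from the paper.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def criterion(points, centroids, labels):
    """Assumed criterion: sum of squared distances to the assigned centroid."""
    return sum(euclidean(p, centroids[l]) ** 2 for p, l in zip(points, labels))


def kmeans(points, k, max_iter=100, seed=0):
    """Standard k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initial centroids: k distinct objects drawn at random from the data.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each object goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels
```

Calling `kmeans(points, k)` on a list of equal-length numeric tuples and then passing the result to `criterion` gives the quantity that, according to the abstract, the improved algorithm reduces relative to this baseline.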