Basic Article Information

Title: Re-Defining K-Means by Increasing Number of Clusters
Authors: Priyanka Kohli; Dr. Harsh Sadawarti
Journal: International Journal of Advanced Research in Computer Science and Software Engineering
Print ISSN: 2277-6451
Electronic ISSN: 2277-128X
Publication year: 2013
Volume: 3
Issue: 8
Publisher: S.S. Mishra
Abstract: Data mining is the process of extracting previously unknown but significant information from large databases. It is also described as the exploration and analysis of large quantities of data in order to discover valid, novel, potentially useful, and ultimately understandable patterns. Valid: the patterns hold in general. Novel: the pattern was not known beforehand. Useful: actions can be devised from the patterns. Understandable: the patterns can be interpreted and comprehended. Data, information, and knowledge play a significant role in human activities. Data mining is a knowledge discovery process that analyzes large volumes of data from various perspectives and summarizes them into useful information; such techniques are needed to analyze, manage, and make decisions about data at this scale. This paper first explains the concept of data mining (DM), aiming to provide an understanding of the overall process and the tools involved: how the process works, what can be done with it, what the main techniques behind it are [1], and what the operational aspects are. It then briefly describes different DM techniques using a few examples, and finally re-defines the k-means clustering algorithm for an increased number of clusters and value of k.
Keywords: Data Mining; KDD; Classification; Clustering; Tanagra Tool