Article Information

Title: Implementation of Hybrid Clustering Algorithm with Enhanced K-Means and Hierarchal Clustering
Authors: Gurjit Singh; Navjot Kaur
Journal: International Journal of Advanced Research In Computer Science and Software Engineering
Print ISSN: 2277-6451
Electronic ISSN: 2277-128X
Publication year: 2013
Volume: 3
Issue: 8
Publisher: S.S. Mishra
Abstract: We propose a hybrid clustering method that combines the strengths of partitioning and agglomerative clustering methods. Clustering algorithms that build meaningful hierarchies out of large document collections are ideal tools for interactive visualization and exploration, as they provide data views that are consistent, predictable, and at different levels of granularity. This paper focuses on text clustering algorithms that build such hierarchical solutions and presents a comprehensive study of partitional and agglomerative algorithms that use different criterion functions and merging schemes. The hybrid algorithms combine features from both partitional and agglomerative approaches, which allows them to reduce the early-stage errors made by agglomerative methods and hence improve the quality of clustering solutions. Clustering algorithms are used to organize and categorize data, for data compression and model construction, for outlier detection, and so on. We propose a way to carry out fast hierarchical clustering of large text or document datasets by using a search engine. This work attempts to study the feasibility of the K-means clustering algorithm in data mining in combination with hierarchical clustering. Such a strategy has been shown to exploit the strengths of both algorithms and to lead to better solutions. These algorithms provide better accuracy and stability and are suitable for text clustering. We calculated the F-measure based on precision and recall and obtained better clustering results. We have evaluated an incremental hierarchical clustering algorithm, which is often used with non-text datasets. Clustering is a data mining process used to group a given data set into clusters based on the similarity between its items. K-means clustering is a clustering method in which the given data set is divided into K clusters. This paper is intended to give an introduction to K-means clustering, hierarchical clustering, and their algorithms. Clustering algorithms are usually completely partitional or completely agglomerative in nature. Fast and high-quality document clustering algorithms play an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters. The main experimental result is that the hybrid approach produces better clustering and reduces CPU utilization.
Keywords: Clustering; text clustering; Hierarchical Clustering; K-means; SOM
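
The abstract mentions an F-measure computed from precision and recall. As a minimal sketch of that quantity, the snippet below scores one cluster against one reference class, both given as collections of document IDs. The function name `f_measure` and the set-based inputs are assumptions made for illustration; the record does not state how the paper aggregates scores across clusters.

```python
def f_measure(cluster, reference_class):
    """Harmonic mean of precision and recall for one cluster vs. one class.

    Both arguments are iterables of document IDs (hypothetical input format).
    """
    cluster = set(cluster)
    reference_class = set(reference_class)
    overlap = len(cluster & reference_class)
    if overlap == 0:
        # No shared documents: precision and recall are both zero.
        return 0.0
    precision = overlap / len(cluster)          # share of the cluster that is correct
    recall = overlap / len(reference_class)     # share of the class that was recovered
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Cluster of 4 documents, 2 of which belong to a 6-document class.
    print(f_measure({1, 2, 3, 4}, {3, 4, 5, 6, 7, 8}))
```

A score of 1.0 means the cluster and the class contain exactly the same documents; a score of 0.0 means they share none.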