Article Information

  • Title: Hierarchical Clustering Based on K-Means as Local Sample (HCKM)
  • Author: Ahmed Fahim
  • Journal: Computer Sciences and Telecommunications
  • Print ISSN: 1512-1232
  • Publication year: 2007
  • Issue: 03
  • Pages: 89-101
  • Publisher: Internet Academy
  • Abstract: Clustering is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favor clusters with spherical shapes and similar sizes, or are very fragile in the presence of outliers. We propose a clustering algorithm called HCKM that is more robust to outliers and identifies clusters having spherical or non-spherical shapes and wide variances in size. HCKM achieves this by representing each cluster by a number of points that are the means of all the smaller sub-clusters forming it. Having more than one representative point per cluster allows HCKM to adjust well to the geometry of non-spherical shapes. Our experimental results confirm that the quality of the clusters produced by HCKM is better than that of the clusters found by existing algorithms. This is because the first phase, which creates the sample, is an enhanced procedure for the k-means algorithm, which enables us to remove the outliers. Furthermore, the results demonstrate that sampling enables HCKM not only to outperform existing algorithms but also to scale well to large databases without sacrificing clustering quality.
  • Keywords: Hierarchical Clustering, Cluster analysis, Data analysis
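
The two-phase idea in the abstract (k-means sub-clusters whose means act as representative points, followed by hierarchical merging of those means) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `kmeans` and `hckm_sketch`, the deterministic seeding, the `min_size` rule that treats very small sub-clusters as outliers, and the single-link merge criterion. The paper's enhanced k-means procedure is not reproduced; plain Lloyd's k-means stands in for it.

```python
import math


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on a list of coordinate tuples (requires k >= 2)."""
    n = len(points)
    # Assumption: deterministic seeding with evenly spaced input points.
    centroids = [points[i * (n - 1) // (k - 1)] for i in range(k)]

    def nearest(p):
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iters):
        labels = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty sub-cluster keeps its previous centroid
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return centroids, [nearest(p) for p in points]


def hckm_sketch(points, k_sub, n_clusters, min_size=2):
    """Return one cluster label per point; -1 marks points treated as outliers."""
    # Phase 1: split the data into k_sub small sub-clusters.
    centroids, labels = kmeans(points, k_sub)
    sizes = [labels.count(c) for c in range(k_sub)]
    # Assumption: sub-clusters smaller than min_size are discarded as outliers.
    kept = [c for c in range(k_sub) if sizes[c] >= min_size]

    # Phase 2: agglomerate the surviving sub-cluster means.
    groups = [[c] for c in kept]

    def gap(a, b):
        # Assumption: single-link distance between two groups of means.
        return min(math.dist(centroids[i], centroids[j]) for i in a for j in b)

    while len(groups) > n_clusters:
        pairs = ((i, j) for i in range(len(groups)) for j in range(i + 1, len(groups)))
        i, j = min(pairs, key=lambda ij: gap(groups[ij[0]], groups[ij[1]]))
        groups[i] += groups.pop(j)  # j > i, so index i stays valid

    sub_to_cluster = {c: g for g, members in enumerate(groups) for c in members}
    return [sub_to_cluster.get(lab, -1) for lab in labels]
```

Because each final cluster is described by several sub-cluster means rather than a single centroid, a merge rule of this kind can follow elongated or irregular shapes, which is the property the abstract attributes to HCKM. The quadratic pair search is kept for readability and is only reasonable when `k_sub` is small.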