Article Information

  • Title: Eradicating Data Duplication of the Clustering Result Using WTQ
  • Authors: P. Vasanthi; Ch. Swapna Priya; P. Suresh Babu
  • Journal: International Journal of Computer Science & Technology
  • Print ISSN: 2229-4333
  • Online ISSN: 0976-8491
  • Publication year: 2013
  • Volume: 4
  • Issue: 3
  • Pages: 557-560
  • Language: English
  • Publisher: Ayushmaan Technologies
  • Abstract: The data generated by conventional categorical data clustering is incomplete because the information provided is also incomplete. This work presents a new link-based approach that improves categorical clustering by discovering unknown entries through the similarity between clusters in an ensemble. A graph partitioning technique is applied to a weighted bipartite graph to obtain the final clustering result. Clustering plays a crucial, foundational role in machine learning, data mining, information retrieval, and pattern recognition. Experimental results on multiple real data sets suggest that the proposed link-based method almost always outperforms both conventional clustering algorithms for categorical data and well-known cluster ensemble techniques. The paper proposes an algorithm called Weighted Triple-Quality (WTQ), which also uses the k-means algorithm for basic clustering. It further introduces a MinHash algorithm to avoid data duplication across different clusters and to support Secure Information Retrieval (SIR) of data from the final cluster ensemble result.
  • Keywords: Clustering; Categorical Data; Cluster Ensembles; Link-Based Similarity; Data Mining
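
The abstract mentions a MinHash step for avoiding duplicated data across clusters but gives no detail. The following is a minimal sketch of generic MinHash duplicate detection for categorical records, not the authors' implementation. The record layout (a dict of attribute to value), the function names, the number of hash functions, and the 0.8 similarity threshold are all assumptions made for illustration.

```python
import hashlib


def record_items(record):
    """Turn a categorical record (attribute -> value) into a set of tokens."""
    return {f"{attr}={value}" for attr, value in record.items()}


def _hash(seed, item):
    """Deterministic 64-bit hash of an item under a given seed (assumed scheme)."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=64):
    """For each seeded hash function, keep the minimum hash over the set."""
    return [min(_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions; estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def find_duplicates(records, threshold=0.8, num_hashes=64):
    """Return pairs of record ids whose estimated similarity meets the threshold."""
    sigs = {
        rid: minhash_signature(record_items(rec), num_hashes)
        for rid, rec in records.items()
    }
    ids = sorted(sigs)
    return [
        (a, b)
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
        if estimated_jaccard(sigs[a], sigs[b]) >= threshold
    ]


# Hypothetical records: r1 and r2 are identical, r3 shares no attribute values.
records = {
    "r1": {"color": "red", "shape": "round", "size": "small"},
    "r2": {"color": "red", "shape": "round", "size": "small"},
    "r3": {"color": "blue", "shape": "square", "size": "large"},
}
duplicates = find_duplicates(records)
```

In a pipeline like the one the abstract outlines, such a check could run over the members of different clusters in the final ensemble result so that a record flagged as a duplicate is kept only once. Comparing every pair is quadratic in the number of records, so larger data sets would typically add locality-sensitive hashing on top of the signatures.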