Article Information

  • Title: Enhanced Hierarchical Clustering for Genome Databases
  • Authors: Sadiq Hussain, Gopal Hazarika
  • Journal: International Journal of Computer Science Issues
  • Print ISSN: 1694-0784
  • Online ISSN: 1694-0814
  • Year: 2011
  • Volume: 8
  • Issue: 4
  • Publisher: IJCSI Press
  • Abstract: Clustering techniques find interesting and previously unknown patterns in large-scale data embedded in a high-dimensional space and are applied to a wide variety of problems such as customer segmentation, biology, data mining, machine learning, and geographical information systems. Efficient clustering algorithms scale with both the dimensionality of the data sets and the size of the database. Hierarchical clustering methods in particular are widely used to find patterns in multidimensional data. In this paper, we design an enhanced hierarchical clustering algorithm that scans the dataset and calculates the distance matrix only once. Our main contribution is to reduce computation time, even when a large database is analyzed. In addition, the results of hierarchical clustering are represented as a binary tree, which gives clarity in grouping and makes clustered objects easier to find. Our algorithm retrieves the number of clusters with the help of a cut distance and measures clustering quality with a validation index in order to obtain the best one; it does not require an initial parameter such as the number of clusters.
  • Keywords: Microarray; Hierarchical clustering; Gene expression data; Binary tree
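
The workflow described in the abstract (one distance-matrix computation, a binary merge tree, a cut distance, and a validation index) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the linkage rule, the distance measure, or the validation index, so single linkage, Euclidean distance, and the Dunn index are assumptions made here for illustration, and the names `Node`, `build_tree`, `cut_tree`, `dunn_index`, and the sample points are all hypothetical.

```python
import math
from itertools import combinations


class Node:
    """Binary tree node: a leaf holds one item, an internal node records a merge."""
    def __init__(self, members, left=None, right=None, height=0.0):
        self.members, self.left, self.right, self.height = members, left, right, height


def distance_matrix(points):
    """Pairwise Euclidean distances (assumed metric), computed exactly once."""
    n = len(points)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = math.dist(points[i], points[j])
    return d


def linkage(d, x, y):
    """Single linkage (assumed): smallest distance between members of x and y."""
    return min(d[i][j] for i in x.members for j in y.members)


def build_tree(d):
    """Agglomerative clustering that only reads the precomputed matrix d."""
    clusters = [Node([i]) for i in range(len(d))]
    while len(clusters) > 1:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(d, clusters[p[0]], clusters[p[1]]))
        merged = Node(clusters[a].members + clusters[b].members,
                      clusters[a], clusters[b],
                      linkage(d, clusters[a], clusters[b]))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0]


def cut_tree(node, cut_distance):
    """Clusters obtained by cutting the tree at cut_distance."""
    if node.left is None or node.height <= cut_distance:
        return [sorted(node.members)]
    return cut_tree(node.left, cut_distance) + cut_tree(node.right, cut_distance)


def dunn_index(d, clusters):
    """Dunn index (assumed validation index): separation / diameter, higher is better.

    Degenerate partitions (one cluster, or only singletons) score 0.0.
    """
    if len(clusters) < 2:
        return 0.0
    separation = min(d[i][j] for x, y in combinations(clusters, 2) for i in x for j in y)
    diameter = max(d[i][j] for c in clusters for i in c for j in c)
    return separation / diameter if diameter > 0 else 0.0


# Hypothetical toy data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
d = distance_matrix(points)          # computed once, reused below
tree = build_tree(d)
candidates = [1.5, 6.5, 8.0]         # hypothetical cut distances to compare
best = max(candidates, key=lambda c: dunn_index(d, cut_tree(tree, c)))
clusters = cut_tree(tree, best)
print(best, sorted(clusters))
```

Every candidate cut reuses the same tree and the same matrix, so trying several cut distances requires no further pass over the data, and the number of clusters falls out of the chosen cut rather than being supplied in advance.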