Article Information

  • Title: Incorporating heterogeneous biological data sources in clustering gene expression data
  • Authors: Gang-Guo Li; Zheng-Zhi Wang
  • Journal: Health
  • Print ISSN: 1949-4998
  • Electronic ISSN: 1949-5005
  • Year of publication: 2009
  • Volume: 1
  • Issue: 1
  • Pages: 17-23
  • DOI: 10.4236/health.2009.11004
  • Publisher: Scientific Research Publishing
  • Abstract: In this paper, a similarity measure between genes based on protein-protein interactions is proposed. The ChIP-chip data are converted into the same form as gene expression data, with Pearson correlation as the similarity measure. On the basis of the similarity measures for protein-protein interaction data and ChIP-chip data, a combined dissimilarity measure is defined. Introducing this combined measure into the K-means method yields an improved K-means method. The improved K-means method and three other clustering methods are evaluated on a real dataset. Performance is assessed by a prediction accuracy analysis based on known gene annotations. The results show that the improved K-means method outperforms the other clustering methods. The improved K-means method is also tested by varying the tuning coefficients of the combined dissimilarity measure. The results show that incorporating heterogeneous data sources is helpful and meaningful when clustering gene expression data, and that the coefficients for genome-wide or complete data sources should be given larger values when constructing the combined dissimilarity measure.
  • Keywords: Statistical Analysis; Similarity/Dissimilarity Measure; Gene Expression Data; Clustering; Data Fusion
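
The abstract describes a combined dissimilarity measure with tuning coefficients but does not give its formula. The sketch below shows one plausible form only, assuming a normalized weighted sum of three terms: Pearson-based dissimilarities for the expression and ChIP-chip profiles, and one minus a protein-protein interaction similarity in [0, 1]. The function names, the `weights` parameter, and the weighted-sum form are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        # Assumption: a constant profile is treated as uncorrelated.
        return 0.0
    return cov / (norm_x * norm_y)


def combined_dissimilarity(expr_i, expr_j, chip_i, chip_j, ppi_sim,
                           weights=(1.0, 1.0, 1.0)):
    """Hypothetical combined dissimilarity between genes i and j.

    expr_*  : gene expression profiles
    chip_*  : ChIP-chip profiles in the same form as expression data
    ppi_sim : protein-protein interaction similarity, assumed in [0, 1]
    weights : tuning coefficients (expression, ChIP-chip, PPI)
    """
    w_expr, w_chip, w_ppi = weights
    # Map a correlation r in [-1, 1] to a dissimilarity in [0, 1].
    d_expr = (1.0 - pearson(expr_i, expr_j)) / 2.0
    d_chip = (1.0 - pearson(chip_i, chip_j)) / 2.0
    d_ppi = 1.0 - ppi_sim
    total = w_expr + w_chip + w_ppi
    # Normalizing by the total weight keeps the result in [0, 1].
    return (w_expr * d_expr + w_chip * d_chip + w_ppi * d_ppi) / total
```

Under these assumptions, raising a coefficient in `weights` makes the corresponding data source dominate the distance, which is how the abstract's recommendation to weight genome-wide or complete sources more heavily would be expressed.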