Article Information

  • Title: Gene coexpression measures in large heterogeneous samples using count statistics
  • Authors: Y. X. Rachel Wang; Michael S. Waterman; Haiyan Huang
  • Journal: Proceedings of the National Academy of Sciences
  • Print ISSN: 0027-8424
  • Electronic ISSN: 1091-6490
  • Publication year: 2014
  • Volume: 111
  • Issue: 46
  • Pages: 16371-16376
  • DOI: 10.1073/pnas.1417128111
  • Language: English
  • Publisher: The National Academy of Sciences of the United States of America
  • Significance: Coexpression analysis is one of the earliest tools for inferring gene associations using expression data but faces new challenges in this "big data" era. In a large heterogeneous dataset, it is likely that gene relationships may change or only exist in a subset of the samples, and they can be nonlinear or nonfunctional. We propose two new robust count statistics to account for local patterns in gene expression profiles. The statistics are generalizable to detect statistical dependence in other application domains. The performance of the statistics is evaluated against a number of popular bivariate dependence measures, showing favorable results. The asymptotic studies of the statistics provide an interesting addition to the combinatorics literature.
  • Abstract: With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the "big data" challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. We propose two new gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance.
  • Keywords: local rank patterns; bivariate association; random permutation statistics; Stein's approximation