Article information

  • Title: Identifying inaccuracies in gene expression estimates from unstranded RNA-seq data
  • Authors: Mikhail Pomaznoy; Ashu Sethi; Jason Greenbaum
  • Journal: Scientific Reports
  • Electronic ISSN: 2045-2322
  • Publication year: 2019
  • Volume: 9
  • Issue: 1
  • Pages: 1-10
  • DOI: 10.1038/s41598-019-52584-w
  • Publisher: Springer Nature
  • Abstract: RNA-seq methods are widely utilized for transcriptomic profiling of biological samples. However, this technology has known caveats that can skew gene expression estimates. Specifically, if the library preparation protocol does not retain RNA strand information, then some genes can be erroneously quantitated. Although strand-specific protocols have been established, a significant portion of RNA-seq data is generated in a non-strand-specific manner. We used a comprehensive stranded RNA-seq dataset of 15 blood cell types to identify genes for which expression would be erroneously estimated if strand information were not available. We found that about 10% of all genes and 2.5% of protein coding genes have a two-fold or higher difference in estimated expression when strand information of the reads was ignored. We used parameters of read alignments of these genes to construct a machine learning model that can identify which genes in an unstranded dataset might have incorrect expression estimates and which ones do not. We also show that differential expression analysis of genes with biased expression estimates in unstranded read data can be recovered by limiting the reads considered to those which span exonic boundaries. The resulting approach is implemented as a package available at https://github.com/mikpom/uslcount.
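
The two-fold criterion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline and does not use the uslcount package. The function name `flag_strand_biased`, the pseudocount of 1, and the toy counts are all assumptions made for illustration. It assumes each gene has one count computed with strand information and one computed with it ignored.

```python
import math


def flag_strand_biased(stranded, unstranded, fold=2.0, pseudocount=1.0):
    """Return sorted gene IDs whose unstranded estimate differs from the
    stranded estimate by at least `fold` in either direction.

    `stranded` and `unstranded` map gene ID -> expression estimate.
    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    threshold = math.log2(fold)
    flagged = []
    for gene, s in stranded.items():
        # Genes missing from the unstranded table are treated as zero.
        u = unstranded.get(gene, 0.0)
        log_fold_change = math.log2((u + pseudocount) / (s + pseudocount))
        if abs(log_fold_change) >= threshold:
            flagged.append(gene)
    return sorted(flagged)


# Hypothetical toy counts, not data from the paper.
stranded = {"geneA": 100.0, "geneB": 10.0, "geneC": 50.0}
unstranded = {"geneA": 105.0, "geneB": 45.0, "geneC": 52.0}

flagged = flag_strand_biased(stranded, unstranded)
print(flagged)  # → ['geneB']
```

In this toy input only `geneB` is flagged, since its unstranded count is more than four times the stranded one, as might happen when reads from an overlapping antisense gene are counted toward it.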