
Basic article information

  • Title: Mapping-friendly sequence reductions: Going beyond homopolymer compression
  • Authors: Luc Blassel; Paul Medvedev; Rayan Chikhi
  • Journal: iScience
  • Print ISSN: 2589-0042
  • Publication year: 2022
  • Volume: 25
  • Issue: 11
  • Pages: 1-16
  • DOI: 10.1016/j.isci.2022.105305
  • Language: English
  • Publisher: Elsevier
  • Abstract: Sequencing errors continue to pose algorithmic challenges to methods working with sequencing data. One of the simplest and most prevalent techniques for ameliorating the detrimental effects of homopolymer expansion/contraction errors present in long reads is homopolymer compression. It collapses runs of repeated nucleotides, to remove some sequencing errors and improve mapping sensitivity. Though our intuitive understanding justifies why homopolymer compression works, it in no way implies that it is the best transformation that can be done. In this paper, we explore if there are transformations that can be applied in the same pre-processing manner as homopolymer compression that would achieve better alignment sensitivity. We introduce a more general framework than homopolymer compression, called mapping-friendly sequence reductions. We transform the reference and the reads using these reductions and then apply an alignment algorithm. We demonstrate that some mapping-friendly sequence reductions lead to improved mapping accuracy, outperforming homopolymer compression.
  • Highlights:
      • Mapping-friendly sequence reductions (MSRs) are functions that transform DNA sequences
      • They are a generalization of the concept of homopolymer compression
      • We show that some well-chosen MSRs enable more accurate long-read mapping
  • Subject areas: Biological sciences; Molecular biology; Biological sciences research methodologies; Transcriptomics
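
The homopolymer compression step described in the abstract can be illustrated with a minimal Python sketch. The function name `homopolymer_compress` and the example sequences are assumptions made here for illustration; they are not taken from the paper, and the sketch does not implement the more general mapping-friendly sequence reductions.

```python
from itertools import groupby


def homopolymer_compress(seq: str) -> str:
    """Collapse every run of identical nucleotides into a single character."""
    # groupby yields one (base, run) pair per maximal run of equal characters.
    return "".join(base for base, _run in groupby(seq))


# Hypothetical example: the read carries a homopolymer expansion error
# (AAAA instead of AA). After compression, read and reference are identical.
reference = "ACGGTTTAAC"
read = "ACGGTTTAAAAC"

print(homopolymer_compress(reference))  # ACGTAC
print(homopolymer_compress(read))       # ACGTAC
```

As the abstract notes, the same transformation is applied to both the reference and the reads before they are passed to an alignment algorithm, so errors in homopolymer run length no longer produce mismatches.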