首页    期刊浏览 2025年02月22日 星期六
登录注册

文章基本信息

  • 标题:Rainbowfish: A Succinct Colored de Bruijn Graph Representation
  • 本地全文:下载
  • 作者:Fatemeh Almodaresi ; Prashant Pandey ; Rob Patro
  • 期刊名称:LIPIcs : Leibniz International Proceedings in Informatics
  • 电子版ISSN:1868-8969
  • 出版年度:2017
  • 卷号:88
  • 页码:18:1-18:15
  • DOI:10.4230/LIPIcs.WABI.2017.18
  • 出版社:Schloss Dagstuhl -- Leibniz-Zentrum fuer Informatik
  • 摘要:The colored de Bruijn graph- a variant of the de Bruijn graph which associates each edge (i.e., k-mer) with some set of colors - is an increasingly important combinatorial structure in computational biology. Iqbal et al. demonstrated the utility of this structure for representing and assembling a collection (population) of genomes, and showed how it can be used to accurately detect genetic variants. Muggli et al. introduced VARI, a representation of the colored de Bruijn graph that adopts the BOSS representation for the de Bruijn graph topology and achieves considerable savings in space over Cortex, albeit with some sacrifice in speed. The memory-efficient representation of VARI allows the colored de Bruijn graph to be constructed and analyzed for large datasets, beyond what is possible with Cortex. In this paper, we introduce Rainbowfish, a succinct representation of the color information of the colored de Bruijn graph that reduces the space usage even further. Our representation also uses BOSS to represent the de Bruijn graph, but decomposes the color sets based on an equivalence relation and exploits the inherent skewness in the distribution of these color sets. The Rainbowfish representation is compressed based on the 0th-order entropy of the color sets, which can lead to a significant reduction in the space required to store the relevant information for each edge. In practice, Rainbowfish achieves up to a 20x improvement in space over VARI. Rainbowfish is written in C++11 and is available at https://github.com/COMBINE-lab/rainbowfish.
  • 关键词:de Bruijn graph; succinct data structures; rank and select operation; colored de Bruijn graph
国家哲学社会科学文献中心版权所有