Article information

Title: PanCake: A Data Structure for Pangenomes
Authors: Corinna Ernst; Sven Rahmann
Journal: OASIcs : OpenAccess Series in Informatics
Electronic ISSN: 2190-6807
Publication year: 2013
Volume: 34
Pages: 35-45
DOI: 10.4230/OASIcs.GCB.2013.35
Publisher: Schloss Dagstuhl -- Leibniz-Zentrum fuer Informatik
Abstract: We present a pangenome data structure ("PanCake") for sets of related genomes, based on bundling similar sequence regions into shared features, which are derived from genome-wide pairwise sequence alignments. We discuss the design of the data structure, basic operations on it and methods to predict core genomes and singleton regions. In contrast to many other pangenome analysis tools, like EDGAR or PGAT, PanCake is independent of gene annotations. Nevertheless, comparison of identified core and singleton regions shows good agreements. The PanCake data structure requires significantly less space than the sum of individual sequence files.
Keywords: pangenome; data structure; core genome; comparative genomics
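
Illustrative sketch: the abstract describes bundling similar sequence regions into shared features and predicting core and singleton regions from them. The Python below is a minimal toy model of that idea, not PanCake's actual implementation or API. The names `Region`, `SharedFeature`, `core_features` and `singleton_features`, the half-open coordinate convention, and the example genomes are all assumptions made for illustration. Here a feature counts as "core" if it contains a region from every genome in the set, and as "singleton" if all of its regions come from a single genome.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Region:
    """A sequence interval on one genome (hypothetical representation)."""
    genome: str
    start: int  # 0-based, inclusive (assumed convention)
    stop: int   # exclusive (assumed convention)


@dataclass
class SharedFeature:
    """A bundle of regions judged similar, e.g. via pairwise alignments."""
    regions: set = field(default_factory=set)

    def genomes(self):
        # Set of genome names that contribute at least one region.
        return {r.genome for r in self.regions}


def core_features(features, genomes):
    """Features that contain a region from every genome in `genomes`."""
    wanted = set(genomes)
    return [f for f in features if wanted <= f.genomes()]


def singleton_features(features):
    """Features whose regions all belong to exactly one genome."""
    return [f for f in features if len(f.genomes()) == 1]


# Toy data: three genomes A, B, C with made-up coordinates.
features = [
    SharedFeature({Region("A", 0, 100), Region("B", 10, 110), Region("C", 5, 105)}),
    SharedFeature({Region("A", 100, 150), Region("B", 110, 160)}),
    SharedFeature({Region("C", 105, 200)}),
]

core = core_features(features, ["A", "B", "C"])
singletons = singleton_features(features)
```

In this toy data the first feature is the only core feature and the third is the only singleton. Storing a shared region once per feature rather than once per genome is one intuition for the space savings the abstract mentions, though the real data structure is described in the paper itself.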