A Data Structure for Pangenomes -- Identification of Singletons and Core Regions Dependent on Pairwise Sequence Similarities
PanCake – An annotation-independent non-database approach for pangenome analysis
PanCake is a non-database approach for identifying singletons and core regions in arbitrary sequence sets, independent of annotation data. Based on pairwise sequence similarities (as provided by common sequence alignment tools), PanCake serializes pangenomes into persistent data structures that support further analysis and graphical representation. Because similar subregions are stored as edit operations rather than as full sequences, PanCake’s data structure requires noticeably less storage than pure sequence data (e.g. FASTA format).
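To illustrate the idea of storing a similar subregion as edit operations, here is a minimal Python sketch. It is not PanCake's actual implementation or file format: the function names `encode_as_edits` and `decode_from_edits` are hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for the pairwise similarities that an alignment tool would provide.

```python
from difflib import SequenceMatcher


def encode_as_edits(reference, sequence):
    """Describe `sequence` as a list of edits against `reference`.

    Hypothetical helper for illustration only. Each edit is a tuple
    (start, end, replacement): reference[start:end] is replaced by
    `replacement`. Matching stretches are not stored at all.
    """
    # autojunk=False disables difflib's heuristic that ignores frequent
    # characters, which would otherwise affect long DNA strings.
    matcher = SequenceMatcher(None, reference, sequence, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            edits.append((i1, i2, sequence[j1:j2]))
    return edits


def decode_from_edits(reference, edits):
    """Rebuild the original sequence from `reference` and its edits."""
    parts = []
    pos = 0
    for start, end, replacement in edits:
        parts.append(reference[pos:start])  # unchanged stretch
        parts.append(replacement)           # substituted or inserted bases
        pos = end                           # skip replaced or deleted bases
    parts.append(reference[pos:])           # unchanged tail
    return "".join(parts)


# Made-up example sequences, not real genomic data.
reference = "ACGTACGTACGTTTGA"
similar = "ACGTACCTACGTTGA"
edits = encode_as_edits(reference, similar)
restored = decode_from_edits(reference, edits)
```

For two highly similar sequences the edit list stays short, so only one full sequence plus a few small tuples has to be kept. This is the intuition behind the reduced storage requirement compared to storing every sequence in full.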
Copyright (c) 2013 Corinna Ernst <email@example.com> (see LICENSE)