A toolbox for improving population genomes.
<b>[This project is in active development and not currently recommended for public use.]</b>
CleanM is a set of tools for improving population genomes. It provides methods designed to improve the completeness of a genome along with methods for identifying and removing contamination. CleanM comprises only part of a full genome QC pipeline and should be used in conjunction with existing QC tools such as [CheckM](https://github.com/Ecogenomics/CheckM/wiki). The functionality currently planned is:
<i>Improving completeness:</i>
* identify contigs with homology to closely related reference genome(s)
* identify contigs with compatible GC, coverage, and tetranucleotide signatures
* identify partial population genomes which should be merged together (requires [CheckM](https://github.com/Ecogenomics/CheckM/wiki))
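To make "compatible GC and tetranucleotide signatures" concrete, here is a minimal sketch in plain Python. It is not CleanM's API: the function names `gc_content`, `tetra_signature`, and `signature_distance` are hypothetical, and the choice of Manhattan distance and of not collapsing reverse-complement 4-mers are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product

# All 256 unambiguous tetranucleotides in a fixed order (illustrative choice;
# reverse complements are NOT collapsed here, which real tools often do).
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_content(seq):
    """Fraction of G and C among unambiguous bases; 0.0 if there are none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def tetra_signature(seq):
    """Relative frequency of each 4-mer; windows with ambiguous bases are ignored."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    return [counts[k] / total if total else 0.0 for k in KMERS]


def signature_distance(sig_a, sig_b):
    """Manhattan distance between two signatures (an assumed metric)."""
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b))
```

A contig could then be treated as a candidate for a genome when its GC content and signature distance to the genome's pooled sequence fall under thresholds chosen by the user; suitable thresholds depend on the data and are not specified here.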
<i>Reducing contamination:</i>
* taxonomically classify contigs within a genome in order to identify outliers
* identify contigs with divergent GC content, coverage, or tetranucleotide signatures
* identify contigs with a coding density suggestive of a eukaryotic origin
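The idea of flagging contigs with divergent GC content can be sketched as follows. This is an illustration under stated assumptions, not CleanM's implementation: the function `flag_gc_outliers`, its `max_dev` parameter, and the rule "deviation from the median GC of the genome" are all hypothetical choices.

```python
from statistics import median


def gc_content(seq):
    """Fraction of G and C among unambiguous bases; 0.0 if there are none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def flag_gc_outliers(contigs, max_dev=0.1):
    """Return sorted IDs of contigs whose GC differs from the median GC
    of all contigs by more than max_dev (an assumed, user-tunable cutoff).

    contigs: dict mapping contig ID -> nucleotide sequence.
    """
    if not contigs:
        return []
    gc = {cid: gc_content(seq) for cid, seq in contigs.items()}
    med = median(gc.values())
    return sorted(cid for cid, value in gc.items() if abs(value - med) > max_dev)


if __name__ == "__main__":
    genome = {
        "c1": "ATGCATGC",
        "c2": "ATGGATCC",
        "c3": "ATGCATGA",
        "c4": "GGGCCCGC",
    }
    print(flag_gc_outliers(genome, max_dev=0.15))
```

The median is used rather than the mean so that a few contaminating contigs do not shift the reference value; the same pattern could be applied to coverage or to a tetranucleotide distance.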
The simplest way to install this package is through pip:
> sudo pip install cleanm
This package requires numpy to be installed and makes use of the following bioinformatics packages:
CheckM relies on several other software packages:
If you find this package useful, please cite this git repository (https://github.com/dparks1134/CleanM).
Copyright © 2015 Donovan Parks. See LICENSE for further details.