A toolbox for improving population genomes.

# CleanM

<b>[This project is in active development and not currently recommended for public use.]</b>

CleanM is a set of tools for improving population genomes. It provides methods for improving the completeness of a genome, along with methods for identifying and removing contamination. CleanM comprises only part of a full genome QC pipeline and should be used in conjunction with existing QC tools such as [CheckM](https://github.com/Ecogenomics/CheckM/wiki). The functionality currently planned is:

<i>Improve completeness:</i>

* identify contigs with homology to closely related reference genome(s)
* identify contigs with compatible GC, coverage, and tetranucleotide signatures
* identify partial population genomes that should be merged together (requires [CheckM](https://github.com/Ecogenomics/CheckM/wiki))
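To make the idea of a tetranucleotide signature concrete, here is a minimal sketch in plain Python. It is not CleanM's implementation: the function names `tetra_signature` and `manhattan_distance` are hypothetical, the choice of Manhattan distance is an assumption, and the sketch ignores reverse complements, which real binning tools commonly merge.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides in a fixed order (illustrative helper).
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_signature(seq):
    """Return the relative frequency of each tetranucleotide in seq.

    Windows containing ambiguous bases (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def manhattan_distance(sig_a, sig_b):
    """Sum of absolute differences between two signatures (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b))
```

Under these assumptions, a contig whose signature lies close to the signature of the rest of the genome would be a candidate for inclusion, while a distant one would be a candidate for removal.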

<i>Reduce contamination:</i>

* taxonomically classify contigs within a genome in order to identify outliers
* identify contigs with divergent GC content, coverage, or tetranucleotide signatures
* identify contigs with a coding density suggestive of a eukaryotic origin
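As an illustration of flagging contigs with divergent GC content, the sketch below marks contigs whose GC fraction lies more than a chosen number of standard deviations from the genome-wide mean. This is only one plausible approach and is not taken from CleanM: the names `gc_content` and `gc_outliers`, the z-score criterion, and the `max_z` default are all assumptions.

```python
from statistics import mean, pstdev


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def gc_outliers(contigs, max_z=2.0):
    """Return IDs of contigs whose GC content is more than max_z
    standard deviations from the mean over all contigs.

    contigs: dict mapping contig ID -> nucleotide sequence.
    max_z:   hypothetical threshold; tune for real data.
    """
    gc = {cid: gc_content(seq) for cid, seq in contigs.items()}
    mu = mean(gc.values())
    sigma = pstdev(gc.values())
    if sigma == 0:
        return []  # all contigs have identical GC content
    return [cid for cid, value in gc.items() if abs(value - mu) / sigma > max_z]


# Toy example: nine contigs at 50% GC and one at 100% GC.
contigs = {f"contig_{i}": "ACGT" * 10 for i in range(9)}
contigs["suspect"] = "GGCC" * 10
flagged = gc_outliers(contigs)
```

An unweighted mean treats short and long contigs equally; weighting by contig length would likely be more robust on real assemblies.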

## Install

The simplest way to install this package is through pip:

> sudo pip install cleanm

This package requires numpy to be installed and makes use of the following bioinformatic packages:

CheckM relies on several other software packages:

## Cite

If you find this package useful, please cite this Git repository (https://github.com/dparks1134/CleanM).

## Copyright

Copyright © 2015 Donovan Parks. See LICENSE for further details.
