EMIRGE reconstructs full-length sequences from short sequencing reads

Project description

EMIRGE: Expectation-Maximization Iterative Reconstruction of Genes from the Environment

EMIRGE reconstructs full-length ribosomal genes from short-read sequencing data. In the process, it also estimates the abundance of each reconstructed sequence.

EMIRGE uses a modified EM algorithm that alternates between two estimates: the expected abundance of every SSU (small-subunit ribosomal RNA) sequence present in a sample, and, for each read, the probability that a specific sequence generated it. At the end of each iteration, those probabilities are used to recalculate (correct) a consensus sequence for each reference SSU sequence; the reads are then mapped again and the probabilities re-estimated. Iteration should usually stop once the reference sequences no longer change from one iteration to the next. In practice, 40-80 iterations are sufficient for many samples. EMIRGE currently uses Bowtie alignments internally, though in theory a different mapper could be used.
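
To make the alternation between the two estimates concrete, here is a minimal sketch of a plain EM loop for abundance estimation. It is not EMIRGE's code: the function name `em_abundances`, the `likelihoods` matrix, and the toy numbers are all assumptions made for illustration. It also omits the steps that distinguish EMIRGE from textbook EM, namely correcting the consensus sequences and re-mapping the reads with Bowtie at each iteration.

```python
def em_abundances(likelihoods, n_iter=80, tol=1e-9):
    """Toy EM for sequence abundances (illustrative only, not EMIRGE's API).

    likelihoods[r][s] is a hypothetical P(read r | sequence s); every read
    is assumed to have a non-zero likelihood for at least one sequence.
    Returns (abundances, posteriors).
    """
    n_reads = len(likelihoods)
    n_seqs = len(likelihoods[0])
    abundances = [1.0 / n_seqs] * n_seqs  # start from uniform abundances
    posteriors = []

    for _ in range(n_iter):
        # E-step: probability that each sequence generated each read,
        # given the current abundance estimates.
        posteriors = []
        for row in likelihoods:
            weighted = [a * l for a, l in zip(abundances, row)]
            total = sum(weighted)
            posteriors.append([w / total for w in weighted])

        # M-step: a sequence's abundance is its average share of the reads.
        updated = [
            sum(post[s] for post in posteriors) / n_reads
            for s in range(n_seqs)
        ]

        # Stop once the estimates no longer change between iterations.
        change = max(abs(u - a) for u, a in zip(updated, abundances))
        abundances = updated
        if change < tol:
            break

    return abundances, posteriors


# Hypothetical input: three reads fit sequence 0 better, one fits sequence 1.
toy = [
    [0.9, 0.1],
    [0.9, 0.1],
    [0.9, 0.1],
    [0.1, 0.9],
]
abundances, posteriors = em_abundances(toy)
print(abundances)
```

In this sketch the stopping rule watches the abundance estimates, whereas the description above stops when the reference sequences themselves stop changing; the simpler criterion is used here only because the sketch does not update any sequences.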

EMIRGE was designed for Illumina reads in FASTQ format produced by pipeline version 1.3 or later.

Download files

Filename: EMIRGE-0.61.1.tar.gz (255.8 kB)
File type: Source
Python version: None
Upload date: Dec 3, 2016