emase

EMASE: Expectation-Maximization algorithm for Allele Specific Expression

Development Status
- 4 - Beta
Intended Audience
- Developers
License
- OSI Approved :: GNU General Public License v3 (GPLv3)
Natural Language
- English
Programming Language
- Python :: 2.6
- Python :: 2.7

Project description

EMASE: Expectation-Maximization algorithm for Allele Specific Expression

Narayanan Raghupathy, Kwangbom Choi, Steve Munger, and Gary Churchill

Free software: GNU General Public License v3 (GPLv3)
Documentation: https://emase.readthedocs.org.

Note: The documentation for EMASE is still a work in progress.

What is EMASE?

EMASE is a software program written in Python that quantifies allele-specific expression and gene expression simultaneously from RNA-seq data. EMASE takes a diploid transcriptome alignment BAM file and a GTF file as inputs and estimates expression abundance for each isoform and each allele using the Expectation-Maximization (EM) algorithm.

Why Use EMASE?

Current RNA-seq analysis pipelines quantify gene expression and allele-specific expression (ASE) in two separate steps: gene expression is estimated from all read alignments, while ASE is assessed separately using only reads that overlap known SNP locations.

Large-scale genome sequencing efforts have characterized millions of genetic variants in humans and model organisms. However, the development of tools that can effectively use this individual- or strain-specific variation to inform quantitation of gene expression abundance has lagged behind.

EMASE, together with g2gtools (https://github.com/churchill-lab/g2gtools), offers an integrated solution that uses known genetic variation to quantify expression abundance at the allele and gene/isoform levels.

In F1 hybrids from model organisms, EMASE lets us use parental strain-specific genetic variation in RNA-seq analysis to quantify gene expression and allele-specific expression (ASE) simultaneously.

In humans, EMASE lets us use an individual’s genetic variation for personalized RNA-seq analysis, quantifying gene expression and ASE simultaneously.

Briefly, EMASE (EM for allele-specific expression) uses individualized diploid genomes/transcriptomes adjusted for known genetic variation and quantifies allele-specific gene expression and total gene expression simultaneously. The EM algorithm employed in EMASE models multi-reads at the level of gene, isoform, and allele and apportions them probabilistically.
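
To make the idea of probabilistic apportioning concrete, here is a minimal sketch of a flat EM in which every read is shared among all targets it aligns to, in proportion to the current abundance estimates. It is an illustration only and not EMASE's implementation: the function name `em_apportion`, the dense 0/1 compatibility matrix, and the default iteration and tolerance settings are all assumptions, and the hierarchical gene/isoform/allele structure used by EMASE is not modeled.

```python
import numpy as np


def em_apportion(compat, n_iter=200, tol=1e-8):
    """Apportion reads among targets with a flat EM (illustrative only).

    compat: (n_reads, n_targets) array of 0/1 flags; compat[r, t] == 1 when
            read r aligns to target t. Every read must align to at least one
            target (hypothetical input layout, not EMASE's file format).
    Returns the expected read count for each target.
    """
    compat = np.asarray(compat, dtype=float)
    n_reads, n_targets = compat.shape
    theta = np.full(n_targets, 1.0 / n_targets)  # uniform starting abundances

    for _ in range(n_iter):
        # E-step: split each read among its compatible targets by abundance.
        weights = compat * theta
        weights /= weights.sum(axis=1, keepdims=True)
        # M-step: new abundances are the normalized expected counts.
        new_theta = weights.sum(axis=0) / n_reads
        converged = np.abs(new_theta - theta).sum() < tol
        theta = new_theta
        if converged:
            break

    return theta * n_reads


# Two reads unique to target 0, one multi-read, one read unique to target 1.
example = [[1, 0], [1, 0], [1, 1], [0, 1]]
expected_counts = em_apportion(example)
```

In this toy input the multi-read ends up weighted toward target 0, because the uniquely aligned reads already suggest that target 0 is more abundant.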

Earlier, we developed Seqnature to use known genetic variation, both SNPs and indels, to build individualized genomes and adjust annotations. One can use Seqnature to create an individualized diploid transcriptome, align RNA-seq reads simultaneously to that diploid transcriptome, and obtain an alignment file in BAM format. This diploid BAM file can be used as input to EMASE.

Applications

Allele-specific gene expression in F1 Hybrids from model organisms

If we have F1 hybrids with parental genetic variant information, we can use Seqnature to build strain-specific genomes and extract the diploid transcriptome. The BAM file obtained by aligning RNA-seq reads to the diploid transcriptome is then used as input for EMASE.

Personalized ASE analysis in Human

EMASE can be used for personalized RNA-seq analysis in humans. For this, we use phased genetic variation (SNP and indel) information to construct a personalized diploid genome and align reads to the resulting diploid transcriptome.

Allele-specific Binding using ChIP-seq in F1 Hybrids

Although we have described EMASE as a tool for quantifying allele-specific expression from RNA-seq data, it can be used with other types of sequencing data. We have successfully used EMASE to quantify allele-specific binding from ChIP-seq data. When using ChIP-seq, one needs to align to diploid binding target sequences instead of the diploid transcriptome.

Mining Diploid alignments and alignment probabilities

Beyond running EMASE to obtain effective read counts for each allele and isoform, EMASE can be used to glean more information from the alignment. For example, EMASE’s count-alignment program reports, for every gene, the reads that are unique at the allele level, the reads that are unique to the gene but are multireads at the allele level, and the total number of aligned reads. Having these alignment statistics for every isoform and gene can be useful in interpreting expression estimates from EMASE.
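
The three statistics described above can be sketched as follows. This is not the count-alignment program itself: the function name `alignment_stats` and the dense boolean array indexed by (read, gene, allele) are assumptions chosen to keep the example short.

```python
import numpy as np


def alignment_stats(aln):
    """Summarize diploid alignments per gene (illustrative only).

    aln: boolean array of shape (n_reads, n_genes, n_alleles), True where a
         read aligns to that allele of that gene (hypothetical layout).
    Returns (allele_unique, allele_multi, total):
      allele_unique[g, a]: reads aligning only to allele a of gene g
      allele_multi[g]:     reads unique to gene g but hitting several alleles
      total[g]:            all reads with any alignment to gene g
    """
    aln = np.asarray(aln, dtype=bool)
    hits_gene = aln.any(axis=2)                # (reads, genes)
    gene_unique = hits_gene.sum(axis=1) == 1   # read hits exactly one gene
    alleles_hit = aln.sum(axis=2)              # alleles hit per (read, gene)

    is_allele_unique = gene_unique[:, None] & (alleles_hit == 1)
    allele_unique = (aln & is_allele_unique[:, :, None]).sum(axis=0)
    allele_multi = (gene_unique[:, None] & (alleles_hit > 1)).sum(axis=0)
    total = hits_gene.sum(axis=0)
    return allele_unique, allele_multi, total


# Four reads, two genes, two alleles per gene.
aln = np.zeros((4, 2, 2), dtype=bool)
aln[0, 0, 0] = True   # unique to gene 0, allele 0
aln[1, 0, :] = True   # unique to gene 0, but hits both alleles
aln[2, 0, 1] = True   # multi-read across genes 0 and 1
aln[2, 1, 1] = True
aln[3, 1, 1] = True   # unique to gene 1, allele 1
allele_unique, allele_multi, total = alignment_stats(aln)
```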

References

EMASE: Expectation-Maximization algorithm for Allele Specific Expression. Narayanan Raghupathy, Kwangbom Choi, Steve Munger, and Gary Churchill. Manuscript in preparation.
[RNA-Seq Alignment to Individualized Genomes Improves Transcript Abundance Estimates in Multiparent Populations](http://www.genetics.org/content/198/1/59.short). Steven C. Munger, Narayanan Raghupathy, Kwangbom Choi, Allen K. Simons, Daniel M. Gatti, Douglas A. Hinerfeld, Karen L. Svenson, Mark P. Keller, Alan D. Attie, Matthew A. Hibbs, Joel H. Graber, Elissa J. Chesler, and Gary A. Churchill. Genetics. 2014 Sep;198(1):59-73. doi: 10.1534/genetics.114.165886.
[PRDM9 Drives Evolutionary Erosion of Hotspots in Mus musculus through Haplotype-Specific Initiation of Meiotic Recombination](http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1004916). Christopher L. Baker, Shimpei Kajita, Michael Walker, Ruth L. Saxl, Narayanan Raghupathy, Kwangbom Choi, Petko M. Petkov, and Kenneth Paigen. PLOS Genetics. Published 08 Jan 2015. doi: 10.1371/journal.pgen.1004916.

History

0.10.16 (05-10-2016)

Modified prepare-emase so it can process the newest Ensembl gene annotation (Release 84)
The script prepare-emase can process gzipped files
Updated documentation

0.10.15 (05-04-2016)

Uploaded to Anaconda.org
Updated documentation

0.10.14 (04-25-2016)

Added an option to omit rname when loading/saving AlignmentPropertyMatrix
Documentation updated to reflect recent changes (e.g., processing paired-end data etc.)

0.10.12 (04-22-2016)

run-emase report file names changed (effective -> expected)
run-emase report file can have notes

0.10.11 (02-15-2016)

Minor change in documentation

0.10.9 (02-09-2016)

Fixed Read the Docs build failures

0.10.5 (01-20-2016)

Added pull_alignments_from method in AlignmentPropertyMatrix class
Added a script pull-out-unique-reads that unsets emase pseudo-alignments that are not uniquely aligning

0.10.3 (01-06-2016)

Fixed a bug in run-emase when handling inbred (reference or one haplotype) alignments

0.10.2 (01-04-2016)

Added get-common-alignments: computes the intersection of the alignments of the two paired ends
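
As a rough illustration of intersecting paired-end alignments, the sketch below keeps, for each read, only the targets that both ends align to. It does not reproduce the get-common-alignments script: the function name `common_alignments` and the dictionaries mapping read IDs to sets of target names are assumptions made for the example.

```python
def common_alignments(end1, end2):
    """Intersect per-read target sets from two paired ends (illustrative).

    end1, end2: dicts mapping a read ID to the set of target names that
                end aligns to (hypothetical in-memory layout).
    Returns a dict holding, for reads seen in both ends, the targets shared
    by both; reads with no shared target are dropped.
    """
    common = {}
    for read_id in end1.keys() & end2.keys():
        shared = end1[read_id] & end2[read_id]
        if shared:
            common[read_id] = shared
    return common


# Made-up read and target names, for illustration only.
end1 = {"r1": {"t1_A", "t1_B"}, "r2": {"t2_A"}, "r3": {"t3_A"}}
end2 = {"r1": {"t1_A"}, "r2": {"t5_B"}, "r4": {"t3_A"}}
paired = common_alignments(end1, end2)
```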

0.9.10 (01-04-2016)

AlignmentMatrixFactory can handle unmapped reads

0.9.8 (07-31-2015)

Fixed a bug in simulate-reads: No more duplicate read IDs

0.9.7 (07-28-2015)

Added create-hybrid: Build hybrid target directly using custom transcripts
Added simulate-reads: Four nested models

0.9.6 (06-02-2015)

AlignmentPropertyMatrix can represent an equivalence class
Fixed a bug in length normalization
Swapped Model IDs between 1 and 2
- Model 1: Gene->Allele->Isoform (*)
- Model 2: Gene->Isoform->Allele (*)
- Model 3: Gene->Isoform*Allele
- Model 4: Gene*Isoform*Allele (RSEM model)

0.9.5 (05-17-2015)

Fixed length normalization: Depth = Count / (Transcript_Length - Read_Length + 1)
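
The formula in this entry can be written as a small helper. This is only a sketch of the arithmetic stated above; the function name `read_depth` and the error handling for transcripts shorter than a read are assumptions.

```python
def read_depth(count, transcript_length, read_length):
    """Depth = Count / (Transcript_Length - Read_Length + 1).

    The denominator is the number of positions at which a read of the given
    length can start within the transcript.
    """
    effective_length = transcript_length - read_length + 1
    if effective_length <= 0:
        # Assumed behavior: refuse transcripts shorter than a single read.
        raise ValueError("transcript is shorter than the read length")
    return float(count) / effective_length


# 900 reads of length 101 on a 1000-base transcript: 900 start positions.
depth = read_depth(900, 1000, 101)
```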

0.9.4 (02-23-2015)

Fixed a bug in prepare-emase

0.9.3 (02-22-2015)

Fixed a bug in Model 2's handling of multireads
run-emase checks absolute sum of error (in TPM) for termination
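
A termination check of this kind might look like the sketch below, which compares two successive abundance estimates after scaling each to transcripts per million (TPM). The function names and the default tolerance of 1 TPM are assumptions for illustration; the values used by run-emase are not stated here.

```python
import numpy as np


def to_tpm(abundance):
    """Scale non-negative abundances so that they sum to one million."""
    abundance = np.asarray(abundance, dtype=float)
    return abundance / abundance.sum() * 1e6


def has_converged(previous, current, tol=1.0):
    """True when the summed absolute TPM change falls below tol.

    tol is a hypothetical threshold chosen for this example.
    """
    error = np.abs(to_tpm(current) - to_tpm(previous)).sum()
    return bool(error < tol)


unchanged = has_converged([1.0, 3.0], [2.0, 6.0])   # same proportions
swapped = has_converged([1.0, 3.0], [3.0, 1.0])     # large change
```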

0.9.2 (02-17-2015)

Added three more models for handling multireads
- Model 1: Gene->Isoform->Allele
- Model 2: Gene->Allele->Isoform
- Model 3: Gene->Isoform*Allele
- Model 4: Gene*Isoform*Allele (RSEM model)

0.9.0 (01-31-2015)

First release on PyPI
Only implements the RSEM model for handling multiply-mapping reads (or multireads)

Release history

- 0.10.16 (this version): May 10, 2016
- 0.10.15: May 4, 2016
- 0.10.14: Apr 26, 2016
- 0.10.13: Apr 26, 2016
- 0.10.12: Apr 23, 2016
- 0.10.11: Feb 15, 2016
- 0.10.10: Feb 15, 2016
- 0.10.9: Feb 9, 2016
- 0.10.5: Jan 20, 2016
- 0.10.3: Jan 6, 2016
- 0.10.2: Jan 5, 2016
- 0.9.8: Jul 31, 2015
- 0.9.7: Jul 29, 2015
- 0.9.6: Jun 2, 2015
- 0.9.5: May 16, 2015
- 0.9.4: Feb 23, 2015
- 0.9.3: Feb 23, 2015
- 0.9.2: Feb 17, 2015
- 0.9.1: Feb 16, 2015
- 0.9.0: Feb 2, 2015

Download files

Source Distribution

emase-0.10.16.tar.gz (52.8 kB)

Uploaded May 10, 2016 Source

File details

Details for the file emase-0.10.16.tar.gz.

File metadata

Download URL: emase-0.10.16.tar.gz
Upload date: May 10, 2016
Size: 52.8 kB
Tags: Source
Uploaded using Trusted Publishing? No

File hashes

Hashes for emase-0.10.16.tar.gz
Algorithm	Hash digest
SHA256	`e9b98f506012428154163c0c39855b181827ae50d314f0c108d151cf62760d6c`
MD5	`f0b72500d17288f6679d799d1cc254d8`
BLAKE2b-256	`09aac7059aa12532faa8a6fe7734ea8d8ff9276f179a57bc32a7b01180fd41e5`
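
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The sketch below is one way to do it; the helper names and the assumption that the archive sits in the current directory under its published file name are illustrative.

```python
import hashlib

# SHA256 digest published above for emase-0.10.16.tar.gz.
EXPECTED_SHA256 = (
    "e9b98f506012428154163c0c39855b181827ae50d314f0c108d151cf62760d6c"
)


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True when the file's SHA256 digest equals the expected digest."""
    return sha256_of(path) == expected.lower()
```

Calling `matches_expected("emase-0.10.16.tar.gz")` on the downloaded file should return `True` if the archive is intact.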
