Molecular Outlier DEtection from Rna-seq data
Introduction
MODER (Molecular Outlier DEtection from Rna sequencing assays) is a comprehensive, user-friendly toolkit for detecting aberrant gene expression, alternative splicing, and allele-specific expression across multiple samples. MODER is built on Python 3 and is easy to use. Users only need to provide a list of BAM files; MODER performs all of the complicated, error-prone processing automatically and returns all three kinds of outliers as gene~sample pairs.
Framework
Documentation
Documentation can be found here
Dependency
bioinformatics software
If you have conda installed, you can install samtools, bedtools and bcftools with the following commands.
conda install -c bioconda samtools
conda install -c bioconda bedtools
conda install -c bioconda bcftools
If you are working on a Debian-based Linux system, you can also install samtools, bedtools and bcftools with the apt package manager:
sudo apt install samtools
sudo apt install bedtools
sudo apt install bcftools
python package
Installation
To install MODER, use git to clone the code to your Linux system. Make sure samtools, bedtools, bcftools and all third-party Python dependencies have been installed; you can then run the program through a Python script named moder.py. See Usage for more information about how to use this program.
git clone -b singleTissue https://github.com/Xu-Dong/mOutlierPipe.git
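Before running moder.py, it can help to confirm that the external tools listed under Dependency are available on your PATH. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical and is not part of MODER.

```python
import shutil

# External command-line tools named in the Dependency section.
REQUIRED_TOOLS = ("samtools", "bedtools", "bcftools")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

This only checks that the executables exist; it does not verify their versions.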
Usage
mode argument
option | description |
---|---|
--expression | set the mode to analyze gene expression data |
--splicing | set the mode to analyze splicing data |
--ase | set the mode to analyze ASE data |
We provide three arguments that decide which analysis pipeline is run. If you provide none of them, all three pipelines are run.
See module1 for more information on the expression pipeline.
See module2 for more information on the splicing pipeline.
See module3 for more information on the ASE pipeline.
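The default behaviour described above (run every pipeline when no mode flag is given) can be sketched with argparse. This is an illustration only, not MODER's actual argument parser; the function name `select_modes` is hypothetical.

```python
import argparse

MODES = ("expression", "splicing", "ase")


def select_modes(argv):
    """Return the pipelines to run for the given command-line arguments."""
    parser = argparse.ArgumentParser(description="mode selection sketch")
    for mode in MODES:
        parser.add_argument("--" + mode, action="store_true")
    # Other options (-i, --gtf, ...) are ignored in this sketch.
    args, _unknown = parser.parse_known_args(argv)
    chosen = [mode for mode in MODES if getattr(args, mode)]
    # No mode flag given: fall back to running all three pipelines.
    return chosen or list(MODES)


print(select_modes(["--ase", "-p", "8"]))
print(select_modes(["-p", "8"]))
```

The first call selects only the ASE pipeline; the second selects all three.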
basic argument
option | description |
---|---|
-i, --input | text file listing the paths of all input BAM files (required) |
--gtf | genome annotation file in GTF format (required) |
-o, --output | directory in which to store all resulting files <font color='red'>(optional; the default output directory is the current directory)</font> |
-p, --parallel | number of tasks to run in parallel <font color='red'>(optional; the default value is 1)</font> |
--threshold | z-score threshold; results whose absolute z-score is larger than this value are reported as outliers <font color='red'>(optional; the default value is 2)</font> |
For more arguments and their usage, refer to featureCounts, PEER, leafcutter, SPOT, gtfToGenePred and genePredToBed.
You can run all of these pipelines with a command such as the following:
python moder.py -p 8 \
--input file_path.txt \
--gtf genome_annotation.gtf \
--vcf example.vcf.gz \
--variation Vg_GTEx_v8.txt \
--tissue MSCLSK \
--threshold 2
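The file passed to --input lists the BAM files to analyze. Assuming it holds one path per line (the exact format is not specified here, so check the documentation), such a file could be generated as follows. The helper name `write_bam_list` and the directory name in the usage comment are hypothetical.

```python
from pathlib import Path


def write_bam_list(bam_dir, list_path):
    """Write the absolute path of every *.bam file in bam_dir, one per line.

    Returns the number of paths written.
    """
    bam_files = sorted(Path(bam_dir).glob("*.bam"))
    lines = [str(path.resolve()) for path in bam_files]
    Path(list_path).write_text("".join(line + "\n" for line in lines))
    return len(lines)


# Example usage (hypothetical directory name):
# write_bam_list("bam_dir", "file_path.txt")
```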
module1: Expression Data Analysis
This module is designed to analyze gene expression data. The basic command-line arguments and their descriptions are listed below. For more available parameters, refer to RNA-SeQC and PEER.
command line arguments
option | description |
---|---|
--expression | set the mode to analyze gene expression data |
-i, --input | text file listing the paths of all input BAM files (required) |
--gtf | genome annotation file in GTF format (required) |
-o, --output | directory in which to store all resulting files <font color='red'>(optional; the default output directory is the current directory)</font> |
-p, --parallel | number of tasks to run in parallel <font color='red'>(optional; the default value is 1)</font> |
--threshold | z-score threshold; results whose absolute z-score is larger than this value are kept <font color='red'>(optional; the default value is 2)</font> |
running example
python mOutlierPipe.py --expression \
--parallel 8 \
--input file_path.txt \
--gtf sample_annotation.gtf \
--threshold 2
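To illustrate what --threshold means for expression data, here is a minimal sketch that computes per-gene z-scores across samples and keeps the gene~sample pairs whose absolute z-score exceeds the threshold. It assumes a plain z-score on an already normalized matrix; MODER's own pipeline involves further processing (for example with PEER), so treat this as an illustration rather than the actual implementation. The function name and the data layout are made up for the example.

```python
from statistics import mean, stdev


def zscore_outliers(expression, threshold=2.0):
    """Return (gene, sample, z) for pairs with |z| > threshold.

    `expression` maps gene -> {sample: value}; each gene needs
    at least two samples.
    """
    outliers = []
    for gene, by_sample in expression.items():
        values = list(by_sample.values())
        mu = mean(values)
        sigma = stdev(values)  # sample standard deviation
        if sigma == 0:
            continue  # no variation across samples, nothing to flag
        for sample, value in by_sample.items():
            z = (value - mu) / sigma
            if abs(z) > threshold:
                outliers.append((gene, sample, z))
    return outliers
```

Note that with few samples the largest attainable z-score is bounded, so a high threshold may never be reached in small cohorts.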
module2: Splicing Data Analysis
This module is designed to analyze splicing data. The basic command-line arguments and their descriptions are listed below. For more available parameters, refer to leafcutter, SPOT and PEER.
command line arguments
option | description |
---|---|
--splicing | set the mode to analyze splicing data |
-i, --input | text file listing the paths of all input BAM files (required) |
--gtf | genome annotation file in GTF format, used to translate cluster IDs to gene IDs (required) |
-o, --output | directory in which to store all resulting files <font color='red'>(optional; the default output directory is the current directory)</font> |
-p, --parallel | number of tasks to run in parallel <font color='red'>(optional; the default value is 1)</font> |
--threshold | z-score threshold; in the splicing analysis pipeline, the z value is translated to a p-value <font color='red'>(optional; the default value is 0.0027)</font> |
running example
python mOutlierPipe.py --splicing \
--parallel 8 \
--input file_path.txt \
--gtf genome_annotation.gtf \
--threshold 2
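The default of 0.0027 is consistent with a two-sided normal tail probability at |z| = 3. Assuming that is the conversion used (the exact formula is not stated here), a z threshold can be translated to a p-value with the standard library. The function name `z_to_two_sided_p` is hypothetical.

```python
import math


def z_to_two_sided_p(z):
    """Two-sided tail probability of a standard normal at |z|."""
    # P(|Z| > z) = erfc(z / sqrt(2)) for a standard normal Z.
    return math.erfc(abs(z) / math.sqrt(2))


for z in (2, 3):
    print(z, round(z_to_two_sided_p(z), 4))
```

Under this conversion, z = 3 gives about 0.0027 and z = 2 gives about 0.0455.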
module3: Allele Specific Expression Analysis
This module is designed to analyze allele-specific expression data. The basic command-line arguments and their descriptions are listed below. For more available parameters, refer to phASER.
command line arguments
option | description |
---|---|
--ase | set the mode to analyze ASE data |
-i, --input | text file listing the paths of all input BAM files (required) |
--gtf | genome annotation file in GTF format, used to translate cluster IDs to gene IDs (required) |
--vcf | Variant Call Format file containing variant information for the genome (required) |
--variant | tissue-specific estimates of genetic variation in gene dosage (required) |
-o, --output | directory in which to store all resulting files <font color='red'>(optional; the default output directory is the current directory)</font> |
-p, --parallel | number of tasks to run in parallel <font color='red'>(optional; the default value is 1)</font> |
--threshold | z-score threshold; in the ASE analysis pipeline, the z value is translated to a p-value <font color='red'>(optional; the default value is 0.0027)</font> |
running example
python mOutlierPipe.py --ase \
--parallel 8 \
--input file_path.txt \
--gtf genome_annotation.gtf \
--vcf sample.vcf \
--variant Vg_GTEx_v8.txt \
--threshold 2