Outrigger

Outrigger is a program that uses junction reads from RNA-seq data and a graph database to create a de novo alternative splicing annotation, and then quantifies percent spliced-in (Psi) for the events it finds.

  • Free software: BSD license

Features

  • Finds novel splicing events, including novel exons, from .bam files (outrigger index)
  • (Optional) Validates that exons have correct splice sites, e.g. GT/AG and AT/AC for mammalian systems (outrigger validate)
  • Calculates “percent spliced-in” (Psi/Ψ) scores for all your samples given the validated events, or the original events if you opted not to validate (outrigger psi)
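
To make the idea of a Psi score concrete, here is a minimal sketch for a single skipped exon event. The function name, the junction labels, the `min_reads` threshold, and the choice to average the two inclusion junctions are assumptions made for illustration; they are not necessarily how outrigger computes Psi internally.

```python
def skipped_exon_psi(junction12, junction23, junction13, min_reads=10):
    """Return Psi for a skipped exon (SE) event, or None if coverage is too low.

    junction12, junction23: reads supporting inclusion (exon1-exon2, exon2-exon3)
    junction13: reads supporting exclusion (exon1-exon3)
    min_reads: hypothetical coverage threshold, not an outrigger default
    """
    # Inclusion is supported by two junctions, so average them to keep the
    # inclusion and exclusion evidence on the same scale.
    inclusion = (junction12 + junction23) / 2.0
    exclusion = float(junction13)
    total = inclusion + exclusion
    if junction12 + junction23 + junction13 < min_reads or total == 0:
        return None
    return inclusion / total


print(skipped_exon_psi(30, 30, 10))  # 30 / (30 + 10) = 0.75
```

A Psi of 1 means every informative read supports inclusion of the exon, and a Psi of 0 means every read supports skipping it.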

Installation

To install outrigger, we recommend using the Anaconda Python Distribution and creating an environment.

You’ll want to add the bioconda (https://bioconda.github.io/) channel to make it easy to install bedtools (https://bedtools.readthedocs.io) and its Python wrapper, pybedtools (https://daler.github.io/pybedtools/).

conda config --add channels r
conda config --add channels bioconda

Create an environment called outrigger-env. Python 2.7, Python 3.4, and Python 3.5 are supported.

conda create -n outrigger-env pandas pybedtools gffutils biopython bedtools joblib

Now activate that environment using source activate outrigger-env and install outrigger from PyPI, using pip:

source activate outrigger-env
pip install outrigger

To check that it installed properly, try the command with the help option (-h), outrigger -h. The output should look like this:

$ outrigger -h
usage: outrigger [-h] {index,validate,psi} ...

Calculate "percent-spliced in" (Psi) scores of alternative splicing on a *de
novo*, custom-built splicing index

positional arguments:
  {index,validate,psi}  Sub-commands
    index               Build an index of splicing events using a graph
                        database on your junction reads and an annotation
    validate            Ensure that the splicing events found all have the
                        correct splice sites
    psi                 Calculate "percent spliced-in" (Psi) values using the
                        splicing event index built with "outrigger index"

optional arguments:
  -h, --help            show this help message and exit

Bleeding edge code from GitHub

For advanced users, if you have git and Anaconda Python installed, you can:

  1. Clone this repository
  2. Change into that directory
  3. Create an environment with the necessary packages from Anaconda
  4. Activate the environment
  5. Install remaining packages from PyPI (graphlite, https://github.com/eugene-eeo/graphlite, is only available on PyPI, not as a conda package)
  6. Install this package

These steps are shown in code below.

git clone git@github.com:YeoLab/outrigger
cd outrigger
conda create --name outrigger --yes --file conda_requirements.txt --channel bioconda
source activate outrigger
pip install -r requirements.txt
pip install .

Quick start

If you just want to know how to run this on your data with the default parameters, start here. Let’s say you performed your alignment in the folder called ~/projects/tasic2016/analysis/tasic2016_v1, and that’s where your SJ.out.tab files from the STAR aligner are (they’re output into the same folder as the .bam files). First you’ll need to change directories to that folder with cd.

cd ~/projects/tasic2016/analysis/tasic2016_v1

Then you need to find all alternative splicing events, which you do by running outrigger index on the splice junction files and the gtf. Here is an example command:

outrigger index --sj-out-tab *SJ.out.tab \
    --gtf /projects/ps-yeolab/genomes/mm10/gencode/m10/gencode.vM10.annotation.gtf
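
If you are curious what the splice junction files contain, the sketch below reads one in plain Python. It assumes the nine-column, tab-separated, header-less layout described in the STAR manual; the column labels, the `min_unique_reads` filter, and the example rows are made up for illustration and are not part of outrigger.

```python
import csv
import io

# Labels chosen here for readability; SJ.out.tab itself has no header row.
SJ_COLUMNS = [
    "chrom", "intron_start", "intron_stop", "strand", "intron_motif",
    "annotated", "unique_reads", "multimap_reads", "max_overhang",
]
INTEGER_COLUMNS = ("intron_start", "intron_stop", "unique_reads",
                   "multimap_reads", "max_overhang")
# STAR encodes strand as 0 (undefined), 1 (+) or 2 (-)
STRAND = {"0": ".", "1": "+", "2": "-"}


def read_sj_out_tab(handle, min_unique_reads=1):
    """Yield one dict per junction with at least min_unique_reads unique reads."""
    for row in csv.reader(handle, delimiter="\t"):
        junction = dict(zip(SJ_COLUMNS, row))
        for key in INTEGER_COLUMNS:
            junction[key] = int(junction[key])
        junction["strand"] = STRAND[junction["strand"]]
        if junction["unique_reads"] >= min_unique_reads:
            yield junction


# Made-up example rows, not real data
example = io.StringIO(
    "chr2\t136763622\t136770056\t1\t1\t1\t40\t0\t38\n"
    "chr2\t136770174\t136773894\t1\t1\t1\t2\t3\t12\n"
)
junctions = list(read_sj_out_tab(example, min_unique_reads=10))
print(junctions[0]["chrom"], junctions[0]["strand"], junctions[0]["unique_reads"])
```

For a real file, pass an open file handle instead of the `io.StringIO` example.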

Next, you’ll want to validate that the splicing events you found follow biological rules, such as containing GT/AG (mammalian major spliceosome) or AT/AC (mammalian minor spliceosome) sequences. To do that, you’ll need to provide the genome name (e.g. mm10) and the genome sequences. An example command is below:

outrigger validate --genome mm10 \
    --fasta /projects/ps-yeolab/genomes/mm10/GRCm38.primary_assembly.genome.fa
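
The rule being checked can be sketched in a few lines of Python. This is only an illustration of the GT/AG and AT/AC test on an intron sequence read 5' to 3' on the transcribed strand; the function name and the example sequences are hypothetical, and outrigger validate itself works from the genome fasta and the exon coordinates.

```python
# Allowed (first two bases, last two bases) of an intron
VALID_SPLICE_SITES = {("GT", "AG"), ("AT", "AC")}


def has_valid_splice_sites(intron_sequence, valid=VALID_SPLICE_SITES):
    """Check the first and last two bases of an intron against the allowed pairs."""
    sequence = intron_sequence.upper()
    if len(sequence) < 4:
        # Too short to hold both a 5' and a 3' splice site
        return False
    return (sequence[:2], sequence[-2:]) in valid


print(has_valid_splice_sites("GTAAGTTTTCCTTTTTCAG"))  # True
print(has_valid_splice_sites("CTAAGTTTTCCTTTTTCAC"))  # False
```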

Finally, you can calculate percent spliced in (Psi) of your splicing events! Thankfully this is very easy:

outrigger psi

Note that ALL of these commands must be run from the same directory, so don’t change directories between steps.

Quick start summary

Here is a summary of the commands, in the order you would use them for outrigger:

cd ~/projects/tasic2016/analysis/tasic2016_v1
outrigger index --sj-out-tab *SJ.out.tab \
    --gtf /projects/ps-yeolab/genomes/mm10/gencode/m10/gencode.vM10.annotation.gtf
outrigger validate --genome mm10 \
    --fasta /projects/ps-yeolab/genomes/mm10/GRCm38.primary_assembly.genome.fa
outrigger psi

This will create a folder called outrigger_output, which at the end should look like this:

$ tree outrigger_output
outrigger_output
├── index
│   ├── gtf
│   │   ├── gencode.vM10.annotation.gtf
│   │   ├── gencode.vM10.annotation.gtf.db
│   │   └── novel_exons.gtf
│   ├── junction_exon_direction_triples.csv
│   ├── mxe
│   │   ├── events.csv
│   │   ├── exon1.bed
│   │   ├── exon2.bed
│   │   ├── exon3.bed
│   │   ├── exon4.bed
│   │   ├── splice_sites.csv
│   │   └── validated
│   │       └── events.csv
│   └── se
│       ├── events.csv
│       ├── exon1.bed
│       ├── exon2.bed
│       ├── exon3.bed
│       ├── splice_sites.csv
│       └── validated
│           └── events.csv
├── junctions
│   ├── metadata.csv
│   └── reads.csv
└── psi
    ├── mxe
    │   └── psi.csv
    ├── outrigger_psi.csv
    └── se
        └── psi.csv

10 directories, 22 files
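
Because all three steps write into the same folder, a quick way to confirm that a run finished is to check for the key files from the listing above. The following is a small sketch using only the standard library; the function name and the choice of which files to check are assumptions, not an outrigger feature.

```python
from pathlib import Path

# Relative paths taken from the directory listing above
EXPECTED_OUTPUTS = [
    "index/se/events.csv",
    "index/se/validated/events.csv",
    "index/mxe/events.csv",
    "index/mxe/validated/events.csv",
    "junctions/reads.csv",
    "junctions/metadata.csv",
    "psi/se/psi.csv",
    "psi/mxe/psi.csv",
    "psi/outrigger_psi.csv",
]


def missing_outputs(output_folder="outrigger_output", expected=EXPECTED_OUTPUTS):
    """Return the expected files that are not present under output_folder."""
    root = Path(output_folder)
    return [relative for relative in expected if not (root / relative).is_file()]


if __name__ == "__main__":
    missing = missing_outputs()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected outrigger output files are present.")
```

Run it from the directory that contains outrigger_output. If you skipped the validate step, the two validated/events.csv entries will be reported as missing.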

For Developers

How to run with the Python debugger

To run the command line functions so that you drop into pdb (the Python debugger) when they break, use:

python -m pdb outrigger/commandline.py index \
    --sj-out-tab outrigger/test_data/tasic2016/unprocessed/sj_out_tab/* \
    --gtf outrigger/test_data/tasic2016/unprocessed/gtf/gencode.vM10.annotation.snap25.myl6.gtf

Notice that you replace outrigger with python -m pdb outrigger/commandline.py, where the path is relative to the root of the cloned GitHub repository.

How to run the tests

If you want to run the tests without calculating what percentage of lines are covered in the test suite, run

make test

If you want to run the tests and see which lines are covered by tests and get an overall percentage of test coverage, run

make coverage

If you want to run an example with ENSEMBL GTF files, do:

make arabdopsis

By default, Travis-CI runs the coverage, lint, and example targets:

script:
- make coverage
- make lint
- make arabdopsis

History

v0.2.9 (November 11th, 2016)

This is a non-breaking release with many speed improvements; upgrading is recommended.

v0.2.9 New features

  • Add bam alignment files as input option

Miscellaneous

  • Parallelized Psi calculation. The exact number of processors can be specified with --n-jobs. The default is --n-jobs -1, which means use as many processors as are available.

v0.2.8 (October 23rd, 2016)

Updated README/HISTORY files

v0.2.7 (October 23rd, 2016)

v0.2.7 New features

  • Added outrigger validate command to check for canonical splice sites by default: GT/AG (U1, major spliceosome) and AT/AC (U12, minor spliceosome). Both of these are user-adjustable as they are only the standard for mammalian genomes.

v0.2.7 API changes

  • Added --resume and --force options to outrigger index to prevent the overwriting of interrupted indexing operations, or to force overwriting. By default, outrigger complains and cowardly exits.

v0.2.7 Bug fixes

  • Support ENSEMBL gtf files, which specify chromosome names with a number, e.g. 4 instead of chr4. Thank you to lcscs12345 for pointing this out!

v0.2.7 Miscellaneous

  • Added version info with outrigger --version
  • Sped up gffutils queries and event finding by running ANALYZE on SQLite databases.

v0.2.6 (September 15th, 2016)

This is a non-breaking patch release

v0.2.6 Bug fixes

  • Wasn’t concatenating exons properly after parallelizing

v0.2.6 Miscellaneous

  • Clarified .gtf file example for directory output

v0.2.5 (September 14th, 2016)

v0.2.5 Bug fixes

  • Added joblib to requirements

v0.2.4 (September 14th, 2016)

This is a non-breaking patch release of outrigger.

v0.2.4 New features

  • Actually parallelized exon finding for novel exons. Previously the parallel code had been written, but only the non-parallelized version was tested and used; now the parallelized version is actually in use!

v0.2.4 Bug fixes

  • outrigger no longer requires the --debug flag in order to run

v0.2.3 (September 13th, 2016)

This is a patch release of outrigger, with non-breaking changes from the previous one.

Bug fixes

  • Subfolders get copied when installing
  • Add test for checking that outrigger -h command works

v0.2.2 (September 12th, 2016)

This is a point release which includes the index submodule in the __all__ statement.

v0.2.1 (September 12th, 2016)

This is a point release which actually includes the requirements.txt file that specifies which packages outrigger depends on.

v0.2.0 (September 9th, 2016)

This is the second release of outrigger!

New features

  • Parallelized exon finding for novel exons
  • Added outrigger validate command to check that your new exons have proper splice sites (e.g. GT/AG and AT/AC)
  • Added more test data for other event types (even though we don’t detect them yet)

v0.1.0 (May 25, 2016)

This is the initial release of outrigger
