Splice aligner of long transcriptomic reads to genome.

uLTRA

uLTRA is a tool for splice alignment of long transcriptomic reads to a genome, guided by a database of exon annotations. uLTRA takes reads in FASTA or FASTQ format and a genome annotation as input and outputs a SAM file. The SAM file records which splice sites were found and whether the read is a full splice match (FSM, including the matching transcript), an incomplete splice match (ISM), novel in catalog (NIC), or novel not in catalog (NNC), as defined in SQANTI. uLTRA is highly accurate when aligning to small exons (see some examples).

uLTRA is distributed as a Python package supported on Linux / OSX with Python >= 3.4.

INSTALLATION

Using conda

Conda is the preferred way to install uLTRA.

  1. Create and activate a new environment called ultra
conda create -n ultra python=3 pip 
source activate ultra
  2. Install uLTRA
pip install ultra_bioinformatics
  3. You should now have 'uLTRA' installed; try it:
uLTRA --help
  4. Install slaMEM
git clone git@github.com:fjdf/slaMEM.git
cd slaMEM
make 

Then either place the generated binary slaMEM in your path or, if you are in the slaMEM folder, run export PATH=$PATH:$PWD/.

Each time you start a new session on your server or computer, activate the conda environment "ultra" before running uLTRA:

source activate ultra
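
As a quick sanity check after the steps above, the following minimal Python sketch reports whether the two executables can be found on your PATH. It only uses the standard library; the helper name missing_tools is made up for illustration and is not part of uLTRA.

```python
import shutil


def missing_tools(tools=("uLTRA", "slaMEM")):
    """Return the names in `tools` that cannot be found on PATH.

    `missing_tools` is a hypothetical helper for this example only.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("uLTRA and slaMEM are both on PATH.")
```

If slaMEM is reported as missing, revisit the PATH step for the slaMEM binary; if uLTRA is missing, check that the "ultra" environment is active.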

Downloading source from GitHub

Dependencies

Make sure the dependencies listed below are installed (installation links below). Versions in parentheses are suggested, as uLTRA has not been tested with earlier versions of these libraries; it may nevertheless work with them.

With these dependencies installed, run

git clone https://github.com/ksahlin/uLTRA.git
cd uLTRA
./uLTRA

USAGE

uLTRA can be used with either Iso-Seq or ONT reads.

Indexing

First, we construct the data structures used in uLTRA using a genome annotation GTF file and a genome fasta file.

uLTRA prep_splicing  all_genes.gtf outfolder/  [parameters]
uLTRA prep_seqs  genome.fasta  outfolder/  [parameters]

Aligning

For example, to align reads using 48 cores, run the command that matches your data type:

uLTRA align  genome.fasta  reads.[fa/fq] outfolder/  --ont --t 48   # ONT cDNA reads using 48 cores
uLTRA align  genome.fasta  reads.[fa/fq] outfolder/  --isoseq --t 48 # PacBio isoseq reads
uLTRA align  genome.fasta  reads.[fa/fq] outfolder/  --k 14  --t 48 # ONT dRNA reads or reads with >10-12% error rate
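
When the indexing and alignment steps are run from a script, it can help to assemble the command lines in one place. The sketch below is one possible way to do that in Python. It only reproduces the subcommands and flags shown in the examples above (prep_splicing, prep_seqs, align, --ont, --isoseq, --k 14, --t); the function names and the preset labels ("ont", "isoseq", "high_error") are invented for this example and are not part of uLTRA.

```python
import shlex

# Flags copied from the three example align commands in this section.
# The dictionary keys are hypothetical labels chosen for this sketch.
PRESET_FLAGS = {
    "ont": ["--ont"],
    "isoseq": ["--isoseq"],
    "high_error": ["--k", "14"],
}


def index_commands(gtf, genome, outfolder):
    """Return the two indexing commands as argument lists."""
    return [
        ["uLTRA", "prep_splicing", gtf, outfolder],
        ["uLTRA", "prep_seqs", genome, outfolder],
    ]


def align_command(genome, reads, outfolder, preset, threads):
    """Return one align command as an argument list for the given preset."""
    return (
        ["uLTRA", "align", genome, reads, outfolder]
        + PRESET_FLAGS[preset]
        + ["--t", str(threads)]
    )


if __name__ == "__main__":
    commands = index_commands("all_genes.gtf", "genome.fasta", "outfolder/")
    commands.append(
        align_command("genome.fasta", "reads.fq", "outfolder/", "ont", 48)
    )
    for cmd in commands:
        # Print only; to execute, pass cmd to subprocess.run(cmd, check=True).
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping each command as an argument list avoids shell quoting problems with file names that contain spaces.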

Pipeline

The pipeline command performs the indexing and alignment steps in a single run:

uLTRA pipeline test/SIRV_genes_C_170612a.gtf  test/SIRV_genes.fasta  test/reads.fa outfolder/  [parameters]

Output

uLTRA outputs a SAM file with alignments to the genome. In addition, it outputs extra tags describing whether all the splice sites are known and annotated (FSM), whether the read uses a new combination of known splice sites (NIC), etc. For details, see the definitions of the notation in the SQANTI paper.
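
To summarise these classifications across a SAM file, the optional fields can be tallied with a few lines of Python. The sketch below relies only on the general SAM layout (11 mandatory tab-separated columns followed by optional TAG:TYPE:VALUE fields, header lines starting with "@"). The tag name "XC" and the toy records in the usage part are assumptions made for illustration; check your own output for the tag names uLTRA actually writes.

```python
from collections import Counter


def optional_tags(sam_line):
    """Return {TAG: value} for the optional fields of one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:  # the first 11 columns are mandatory
        tag, _type, value = field.split(":", 2)
        tags[tag] = value
    return tags


def count_tag_values(sam_lines, tag):
    """Count the values of `tag` over all alignment lines, skipping headers."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue
        value = optional_tags(line).get(tag)
        if value is not None:
            counts[value] += 1
    return counts


if __name__ == "__main__":
    # Toy records; "XC" is a placeholder tag name, not confirmed by this README.
    toy = [
        "@HD\tVN:1.6",
        "read1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*\tXC:Z:FSM",
        "read2\t0\tchr1\t200\t60\t10M\t*\t0\t0\tACGTACGTAC\t*\tXC:Z:NIC",
        "read3\t0\tchr1\t300\t60\t10M\t*\t0\t0\tACGTACGTAC\t*\tXC:Z:FSM",
    ]
    print(count_tag_values(toy, "XC"))
```

For a real file, pass an open file handle instead of the toy list, for example count_tag_values(open("reads.sam"), "XC"), where the file name is again a placeholder.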

CREDITS

Please cite [1] when using uLTRA.

  1. Kristoffer Sahlin, Veli Makinen (2019) "Accurate spliced alignment of long RNA sequencing reads". (In preparation)

LICENCE

GPL v3.0, see LICENSE.txt.