Python package for testing strandedness of RNA-Seq fastq files

Ever get RNA-Seq data where the library prep or strandedness has been omitted in the methods?

Checking strandedness up front should save you some headaches later in your pipeline and analysis, when you would otherwise realise you’ve used the wrong strandedness setting (RF/fr-firststrand, FR/fr-secondstrand, or unstranded).

Requirements

how_are_we_stranded_here requires the following packages be installed:

kallisto == 0.44.x

python >= 3.6.0

RSeQC

It also requires a transcriptome annotation (a .fasta file, e.g. Ensembl’s .cdna.fasta, or a prebuilt kallisto index) and a corresponding GTF file.

Sometimes pseudoalignments will not work with newer versions of kallisto. If this is an issue, we suggest downgrading to 0.44.0.

Installation

pip install how_are_we_stranded_here

Usage

For basic usage, run check_strandedness with a gtf transcript annotation, transcripts fasta file and fastq read files from one sample.

check_strandedness --gtf Yeast.gtf --transcripts Yeast_cdna.fasta --reads_1 Sample_A_1.fq.gz --reads_2 Sample_A_2.fq.gz

Output

check_strandedness will print to console the results of infer_experiment.py (http://rseqc.sourceforge.net/#infer-experiment-py), along with an interpretation.

checking strandedness
Reading reference gene model stranded_test_WT_yeast_rep1_1_val_1_1/Saccharomyces_cerevisiae.R64-1-1.98.bed ... Done
Loading SAM/BAM file ...  Total 20000 usable reads were sampled
This is PairEnd Data
Fraction of reads failed to determine: 0.0595
Fraction of reads explained by "1++,1--,2+-,2-+": 0.0073 (0.8% of explainable reads)
Fraction of reads explained by "1+-,1-+,2++,2--": 0.9332 (99.2% of explainable reads)
Over 90% of reads explained by "1+-,1-+,2++,2--"
Data is likely RF/fr-firststrand
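
The interpretation step can be sketched in a few lines of Python. This is a minimal illustration, not the package's own code: the function name `interpret_strandedness` is hypothetical, the 90% cutoff is taken from the output above, and the 60% cutoff used to call unstranded data is an assumption. The mapping of "1++,1--,2+-,2-+" to FR/fr-secondstrand follows the usual RSeQC convention.

```python
def interpret_strandedness(fwd_fraction, rev_fraction,
                           stranded_cutoff=0.9, unstranded_cutoff=0.6):
    """Classify a library from the two fractions reported by infer_experiment.py.

    fwd_fraction: fraction of reads explained by "1++,1--,2+-,2-+"
    rev_fraction: fraction of reads explained by "1+-,1-+,2++,2--"
    The cutoffs are assumptions (see the text above), applied to the
    share of explainable reads rather than to all sampled reads.
    """
    explainable = fwd_fraction + rev_fraction
    if explainable <= 0:
        raise ValueError("no explainable reads")

    # Re-express each fraction as a share of the explainable reads.
    fwd_share = fwd_fraction / explainable
    rev_share = rev_fraction / explainable

    if fwd_share > stranded_cutoff:
        call = "FR/fr-secondstrand"
    elif rev_share > stranded_cutoff:
        call = "RF/fr-firststrand"
    elif max(fwd_share, rev_share) < unstranded_cutoff:
        call = "unstranded"
    else:
        call = "undetermined"
    return fwd_share, rev_share, call


# Values taken from the example output above.
fwd_share, rev_share, call = interpret_strandedness(0.0073, 0.9332)
print(f"{fwd_share:.1%} / {rev_share:.1%} -> {call}")
```

With the fractions from the example output, this reproduces the 0.8% and 99.2% shares and the RF/fr-firststrand call.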

Any intermediate files are written to a folder in your current working directory derived from the name of the reads_1 file.

How it Works

check_strandedness.py runs a series of commands to check in which direction reads align once mapped to transcripts.

It first creates a kallisto index (or uses a pre-made index) of your organism’s transcriptome.

It then maps a small subset of reads (default 200000) to the transcriptome, and uses kallisto’s --genomebam argument to project the pseudoalignments to a genome-sorted BAM file.
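
One simple way to obtain such a subset is to copy the first N records of each FASTQ file. The sketch below is an illustration only: how the package actually selects its subset is not described here, the helper name `take_first_reads` is hypothetical, and it assumes standard four-line FASTQ records in a gzipped file.

```python
import gzip
from itertools import islice


def take_first_reads(fastq_gz_in, fastq_out, n_reads=200000):
    """Copy the first n_reads records of a gzipped FASTQ to a plain-text FASTQ.

    Assumes each record spans exactly four lines. Returns the number of
    complete records written.
    """
    n_lines = 0
    with gzip.open(fastq_gz_in, "rt") as src, open(fastq_out, "w") as dst:
        # Stream line by line so large files are never held in memory.
        for line in islice(src, 4 * n_reads):
            dst.write(line)
            n_lines += 1
    return n_lines // 4
```

For paired-end data the same number of records would be taken from both files so that mates stay in step.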

It finally runs RSeQC’s infer_experiment.py to check in which direction the first and second reads of each pair are aligned relative to the transcript strand, and reports the likely strandedness of your data.
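
The three steps above can be sketched as a list of external commands. This is a rough outline under stated assumptions, not the package's actual implementation: `build_commands` and the index filename `transcripts.idx` are hypothetical, the BED file is assumed to have been converted from the GTF beforehand, and the exact arguments the package passes may differ. The flags shown (`kallisto index -i`, `kallisto quant -i/-o/--genomebam/--gtf`, `infer_experiment.py -r/-i`) are standard options of those tools, and `pseudoalignments.bam` is the name kallisto gives the BAM it writes to the output folder.

```python
from pathlib import Path


def build_commands(transcripts, gtf, bed, reads_1, reads_2, out_dir, index=None):
    """Return the external commands for the three steps as argument lists."""
    out = Path(out_dir)
    # Hypothetical index filename, used only when no pre-made index is given.
    index_path = Path(index) if index else out / "transcripts.idx"

    commands = []
    if index is None:
        # Step 1: build a kallisto index from the transcriptome FASTA.
        commands.append(["kallisto", "index", "-i", str(index_path), str(transcripts)])

    # Step 2: pseudoalign the reads and project them onto the genome.
    commands.append([
        "kallisto", "quant", "-i", str(index_path), "-o", str(out),
        "--genomebam", "--gtf", str(gtf), str(reads_1), str(reads_2),
    ])

    # Step 3: infer strandedness from the genome-sorted BAM.
    commands.append([
        "infer_experiment.py", "-r", str(bed),
        "-i", str(out / "pseudoalignments.bam"),
    ])
    return commands


for cmd in build_commands("Yeast_cdna.fasta", "Yeast.gtf", "Yeast.bed",
                          "Sample_A_1.fq", "Sample_A_2.fq", "stranded_test"):
    print(" ".join(cmd))
```

Each argument list could be passed to `subprocess.run(cmd, check=True)` once kallisto and RSeQC are installed. Building the commands separately from running them keeps the sketch easy to inspect.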
