Papolarity is a tool to analyze polarity of transcriptomic alignments such as Ribo-seq and RNA-seq.

Papolarity

Papolarity is a Python package for analysis of transcript-level short read coverage profiles.

For a single sample, papolarity computes the classic polarity metric for each transcript; in the case of Ribo-Seq, this metric reflects ribosome positional preferences.
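
As an illustration, the sketch below computes a polarity score under one common definition: the coverage-weighted mean of the position along the transcript, with positions rescaled to run from -1 (5' end) to +1 (3' end). The function name and the handling of empty profiles are assumptions made for this example; papolarity's own implementation may differ in details such as clipping of transcript ends.

```python
def polarity(coverage):
    """Coverage-weighted mean position, rescaled to [-1, 1].

    `coverage` is a per-nucleotide list of read counts for one transcript.
    Returns None when the score is undefined (no reads or a 1-nt profile).
    This is an illustrative definition, not papolarity's actual code.
    """
    n = len(coverage)
    total = sum(coverage)
    if n < 2 or total == 0:
        return None
    # Position i (0-based) gets weight -1 at the 5' end and +1 at the 3' end.
    weighted = sum(c * (2 * i - (n - 1)) / (n - 1) for i, c in enumerate(coverage))
    return weighted / total


# Hypothetical profiles: uniform coverage versus reads piled up at the 3' end.
uniform_score = polarity([1, 1, 1, 1])
three_prime_score = polarity([0, 0, 0, 5])
```

A uniform profile gives a score near zero, while coverage concentrated at either end pushes the score towards -1 or +1.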

For comparison versus a control sample, papolarity estimates an improved metric, the relative linear regression slope of coverage along transcript length. This involves de-noising by profile segmentation with a Poisson model (using pasio), and aggregation of Ribo-Seq coverage within segments, thus achieving reliable estimates of the regression slope.
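
The idea can be sketched roughly as follows, assuming the segmentation is already available as a list of half-open (start, stop) intervals (in papolarity it comes from pasio). The pseudocount, the use of log2 ratios, the segment-length weights and the segment midpoints as x-values are all illustrative assumptions, not a description of papolarity's exact procedure.

```python
import math


def segment_slope(sample, control, segments, pseudocount=1.0):
    """Weighted least-squares slope of log2(sample/control) along a transcript.

    `sample` and `control` are per-nucleotide coverage lists of equal length;
    `segments` is a list of half-open (start, stop) intervals covering them.
    Returns None if fewer than two distinct segment positions are available.
    """
    n = len(sample)
    xs, ys, ws = [], [], []
    for start, stop in segments:
        # Aggregate coverage within each segment to reduce noise.
        s = sum(sample[start:stop]) + pseudocount
        c = sum(control[start:stop]) + pseudocount
        xs.append((start + stop) / 2 / n)  # relative position of the midpoint
        ys.append(math.log2(s / c))
        ws.append(stop - start)            # longer segments weigh more
    wsum = sum(ws)
    if wsum == 0:
        return None
    x_mean = sum(w * x for w, x in zip(ws, xs)) / wsum
    y_mean = sum(w * y for w, y in zip(ws, ys)) / wsum
    sxx = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    if sxx == 0:
        return None
    sxy = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    return sxy / sxx


# Hypothetical data: the sample gains coverage in the second half.
control_profile = [1] * 8
sample_profile = [1, 1, 1, 1, 8, 8, 8, 8]
slope = segment_slope(sample_profile, control_profile, [(0, 4), (4, 8)])
```

A positive slope here means that, relative to the control, the sample's coverage shifts towards the 3' end.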

Publication: Assessing Ribosome Distribution Along Transcripts with Polarity Scores and Regression Slope Estimates. Methods Mol Biol. 2021;2252:269-294. doi:10.1007/978-1-0716-1150-0_13

Toolkit

Papolarity provides a toolkit for the different tasks involved in processing transcriptomic data such as Ribo-Seq alignments.

Installation:

python -m pip install papolarity

You will most likely also want to install pasio: python -m pip install pasio

The package is organized as a single entry point for a set of subcommands.

You can run it with one of these commands:

  • papolarity [arguments]

  • python -m papolarity [arguments], if you need a specific version of Python to run the package.

Python 3.7+ is supported; this restriction will probably be relaxed later.

There are no conventions about folder structure or file names. All files used by a tool are always specified in command-line arguments.

Papolarity has a few conventions about file names: all files with the .gz extension are treated as gzip archives. Input files with names ending in .gz are unpacked automatically, and output files with such names are packed automatically. A - character in place of a filename is treated as stdin or stdout, which is useful when papolarity is part of a pipeline.
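
If you write your own helper scripts around papolarity, you may want them to follow the same conventions. Below is a minimal sketch of such a helper; the name open_text and the example file name are hypothetical and are not part of papolarity's API.

```python
import gzip
import os
import sys
import tempfile


def open_text(path, mode="r"):
    """Open `path` as text following the conventions described above.

    "-" means stdin (for reading) or stdout (for writing), names ending
    in ".gz" are read or written through gzip, anything else is a plain file.
    `mode` must be "r" or "w".
    """
    if mode not in ("r", "w"):
        raise ValueError("mode must be 'r' or 'w'")
    if path == "-":
        return sys.stdin if mode == "r" else sys.stdout
    if path.endswith(".gz"):
        return gzip.open(path, mode + "t")
    return open(path, mode)


# Round trip through a gzip-compressed file in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "coverage.tsv.gz")
    with open_text(path, "w") as out:
        out.write("transcript\tcoverage\n")
    with open_text(path) as src:
        header = src.readline()
```

Note that the helper returns sys.stdin or sys.stdout themselves for "-", so callers should avoid closing those streams.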

You can follow the protocol to get an idea of how these tools are supposed to be used. If you need to customize pipelines, please refer to the help for the corresponding tools: papolarity --help lists all available tools, and papolarity <cmd> --help describes all arguments and options of the specified tool.
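
When driving papolarity from a Python script, both invocation styles can be wrapped in a small helper. The sketch below only uses the --help option mentioned above; the helper name papolarity_argv is hypothetical, and the help text is printed only if papolarity is found on PATH.

```python
import shutil
import subprocess
import sys


def papolarity_argv(*args, use_module=False):
    """Build the argument list for running papolarity.

    With use_module=True the command is run as `python -m papolarity`
    using the current interpreter, which pins the Python version.
    """
    if use_module:
        return [sys.executable, "-m", "papolarity", *args]
    return ["papolarity", *args]


# List the available tools, but only if papolarity is actually installed.
if shutil.which("papolarity"):
    subprocess.run(papolarity_argv("--help"), check=False)
```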

Protocol

In our paper "Assessing Ribosome Distribution Along Transcripts with Polarity Scores and Regression Slope Estimates" (doi:10.1007/978-1-0716-1150-0_13) we describe a protocol for Ribo-Seq analysis. The file protocol-paper.sh contains the script we used in the paper to process our datasets. Compared to the paper, it is slightly modified for better readability and is more easily customizable. It also has a few additional commands for generating plots, which are absent from the paper. Steps are named after the paper sections.

You can use this protocol as is or change any parts you wish. As long as you comply with the data formats and use consistent data (e.g. all files should be clipped in the same manner, or not clipped at all), papolarity will work; the order of commands, folder names, file names and so on do not matter.

To run this pipeline, you need several auxiliary tools installed: csvtk, GNU parallel, and the Python package pasio.

Important notes

  • (!!!) It's VERY important to align reads onto a transcriptome, not onto a genome. Please double-check the type of your alignment if you run into problems.
  • Version 1.1.0 had a bug which caused incorrect results for transcripts
  • In the originally published protocol we recommended mapping the reads with STAR and then keeping only uniquely mapped reads by MAPQ filtering with samtools. This approach might be too stringent and even problematic, as many reads that initially map uniquely to the genome become multi-mappers in the alignment to the transcriptome (there are often several overlapping transcripts per gene in the transcript annotation). The updated version of the protocol solves this issue by requiring unique read mapping at the initial alignment step (see the protocol-paper-obtain-data.sh script) without additional post-filtering. Other read mapping strategies (including keeping some multi-maps) might also be applicable in particular scenarios.
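
To see how much the multi-mapping issue from the last note affects your data, you can count how many transcripts each read aligns to. The sketch below works on SAM-formatted text lines; the function name and the example records (including the transcript names) are made up for illustration, and real alignments would more conveniently be read with a dedicated SAM/BAM library.

```python
from collections import defaultdict


def transcripts_per_read(sam_lines):
    """Map each read name to the set of reference names it aligns to."""
    hits = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        if flag & 0x4:            # read is unmapped
            continue
        hits[qname].add(rname)
    return dict(hits)


# Synthetic records: read1 falls into a region shared by two isoforms of
# one gene, so it has a primary and a secondary (flag 256) alignment.
example_sam = [
    "@SQ\tSN:GENE1-201\tLN:1500",
    "read1\t0\tGENE1-201\t101\t3\t30M\t*\t0\t0\t*\t*",
    "read1\t256\tGENE1-202\t57\t3\t30M\t*\t0\t0\t*\t*",
    "read2\t0\tGENE2-201\t10\t255\t30M\t*\t0\t0\t*\t*",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
hits = transcripts_per_read(example_sam)
multimappers = sorted(name for name, refs in hits.items() if len(refs) > 1)
```

In this synthetic example, read1 would be discarded by a strict uniqueness filter applied to the transcriptome alignment, even though it comes from a single genomic location.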
