Annotation and segmentation of MAS-seq data


Longbow is a command line tool to process MAS-ISO-seq data. Longbow employs a generative modelling approach to accurately annotate and segment MAS-ISO-seq’s concatenated full-length transcript isoforms from single-cell or bulk long read RNA sequencing libraries.

Documentation for all longbow commands can be found on the Longbow documentation page.

Installation

pip is recommended for Longbow installation.

pip install maslongbow

For a pre-built version that includes all dependencies, pull our Docker image.

docker pull us.gcr.io/broad-dsp-lrma/lr-longbow:0.5.37

To install from the GitHub source for development, run the following commands.

git clone https://github.com/broadinstitute/longbow.git
pip install -e longbow/
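
After installing, it can help to confirm that Longbow and the external tools used in the Getting Started workflow are on your PATH. The snippet below is a minimal sketch and not part of Longbow: missing_tools is a hypothetical helper name, and the tool list (longbow, samtools, minimap2) is taken from the commands in the next section.

```shell
# Hypothetical helper (not part of Longbow): print each named tool that is
# not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools called by the Getting Started workflow.
missing=$(missing_tools longbow samtools minimap2)
if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing" >&2
else
    longbow --help    # show top-level usage to confirm the install works
fi
```

If anything is reported as missing, install it before running the workflow; samtools and minimap2 are separate tools and are not installed by pip install maslongbow.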

Getting Started

The commands below illustrate the Longbow workflow on a small library of SIRVs (Spike-in RNA Variant Control Mixes). MAS-ISO-seq concatenated transcripts are annotated, filtered, and segmented using the mas15v2 model, and the adapter-free cDNA sequences are then extracted. Final filtered transcripts can then be aligned using standard splice-aware long-read mappers (e.g. minimap2). More detail for each command can be found in the full documentation.

# Download a tiny test dataset (less than 300K)
wget https://github.com/broadinstitute/longbow/raw/main/tests/test_data/mas15_test_input.bam
wget https://github.com/broadinstitute/longbow/raw/main/tests/test_data/mas15_test_input.bam.pbi
wget https://github.com/broadinstitute/longbow/raw/main/tests/test_data/resources/SIRV_Library.fasta

# Basic processing workflow
longbow annotate -m mas15v2 mas15_test_input.bam |  # Annotate reads according to the mas15v2 model
  tee ann.bam |                                     # Save annotated BAM for later
  longbow filter |                                  # Filter out improperly-constructed arrays
  longbow segment |                                 # Segment reads according to the model
  longbow extract -o filter_passed.bam              # Extract adapter-free cDNA sequences

# Align reads with a long-read aligner (e.g. minimap2, pbmm2)
samtools fastq filter_passed.bam | \
    minimap2 -ayYL --MD -x splice:hq SIRV_Library.fasta - | \
    samtools sort > align.bam &&
    samtools index align.bam
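
Once the workflow finishes, a quick check that the expected files were written can catch a silently failed pipeline stage. The snippet below is a minimal sketch: check_outputs is a hypothetical helper (not a Longbow command), and the file names are the ones used in the commands above.

```shell
# Hypothetical helper (not part of Longbow): succeed only if every named
# file exists and is non-empty; report the first one that is not.
check_outputs() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            printf 'missing or empty: %s\n' "$f" >&2
            return 1
        fi
    done
}

# Files written by the workflow above; align.bam.bai is the default index
# name produced by `samtools index align.bam`.
if check_outputs ann.bam filter_passed.bam align.bam align.bam.bai; then
    samtools flagstat align.bam    # summary counts of mapped/unmapped reads
fi
```

Because a shell pipeline normally reports only the exit status of its last command, an early failure (for example in longbow annotate) can go unnoticed; in bash, running set -o pipefail before the workflow makes such failures visible.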

Getting help

The Longbow documentation page provides detailed descriptions of command line options and algorithmic details. If you encounter bugs or have questions, comments, or concerns, please file an issue on our GitHub page.

Developers’ guide

For information on contributing to Longbow development, visit our developer documentation.


