
Splice junction scoring tool



Splam is a splice junction recognition model based on a deep residual convolutional neural network that offers fast and precise assessment of splice junctions. It was trained on combined donor-acceptor pairs and focuses on a narrow window of 400 base pairs surrounding each splice site, inspired by the understanding that the splicing process primarily depends on signals within this region.

Why Splam❓

  1. We need a tool to evaluate splice junctions and spliced alignments. Thousands of RNA-Seq datasets are generated every day, but there are no tools available for cleaning up spurious spliced alignments in these data. Splam addresses this problem!
  2. Splam-cleaned alignments lead to improved transcript assembly, which, in turn, may enhance all downstream RNA-Seq analyses, including transcript quantification, differential gene expression analysis, and more.

Who is it for❓

If you are (1) doing RNA-Seq data analysis or (2) seeking a trustworthy way to evaluate splice junctions (introns), then Splam is the tool that you are looking for!


What does Splam do❓

There are two main use case scenarios:

  1. Improving your alignment file. Splam evaluates the quality of spliced alignments and removes those containing spurious splice junctions. This significantly enhances the quality of downstream transcriptome assemblies.

  2. Evaluating the quality of introns in your annotation file or assembled transcripts.


Documentation

📒 The full user manual is available here

Installation

Splam is available on PyPI, which is the easiest way to install it. Check out all the releases here.

$ pip install splam

You can also install Splam from source:

$ git clone https://github.com/Kuanhao-Chao/splam --recursive

$ cd splam/src/

$ python setup.py install

Quick Start

Running Splam is simple. It only requires three lines of code!

See these examples on Google Colab:

Example 1: clean up alignment files (BAM)

$ cd test

# Step 1: extract splice junctions in the alignment file
$ splam extract -P SRR1352129_chr9_sub.bam -o tmp_out_alignment

# Step 2: score all the extracted splice junctions
$ splam score -G chr9_subset.fa -m ../model/splam_script.pt -o tmp_out_alignment tmp_out_alignment/junction.bed

# Step 3: output a cleaned and sorted alignment file
$ splam clean -o tmp_out_alignment

Example 2: evaluate annotation files / assembled transcripts (GFF)

$ cd test

# Step 1: extract introns in the annotation
$ splam extract refseq_40_GRCh38.p14_chr_fixed.gff -o tmp_out_annotation

# Step 2: score introns in the annotation
$ splam score -G chr9_subset.fa -m ../model/splam_script.pt -o tmp_out_annotation tmp_out_annotation/junction.bed

# Step 3: output statistics of each transcript
$ splam clean -o tmp_out_annotation

Example 3: evaluate mouse annotation files (GFF)

$ cd test

# Step 1: extract introns in the annotation
$ splam extract mouse_chr19.gff -o tmp_out_generalization

# Step 2: score introns in the annotation
$ splam score -A GRCm39_assembly_report.txt -G mouse_chr19.fa -m ../model/splam_script.pt -o tmp_out_generalization tmp_out_generalization/junction.bed

# Step 3: output statistics of each transcript
$ splam clean -o tmp_out_generalization
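
All three examples follow the same extract, score, clean pattern, so they can be chained in one small wrapper. The following is only a sketch: `run` and `splam_pipeline` are hypothetical helper names, not part of Splam, and `DRY_RUN` is an assumed convention for printing the commands instead of running them. The sketch uses only the `splam` subcommands and options shown in the examples above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of Splam): execute a command, or just
# print it when DRY_RUN=1 so the pipeline can be inspected first.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the three Quick Start steps.
# Usage: splam_pipeline INPUT OUT_DIR GENOME_FASTA MODEL [extra `splam extract` options...]
splam_pipeline() {
  local input=$1 out_dir=$2 genome=$3 model=$4
  shift 4
  # Step 1: extract splice junctions / introns from the BAM or GFF input
  run splam extract "$@" "$input" -o "$out_dir"
  # Step 2: score the extracted junctions
  run splam score -G "$genome" -m "$model" -o "$out_dir" "$out_dir/junction.bed"
  # Step 3: write the cleaned alignment file or per-transcript statistics
  run splam clean -o "$out_dir"
}
```

With these definitions loaded, `splam_pipeline SRR1352129_chr9_sub.bam tmp_out_alignment chr9_subset.fa ../model/splam_script.pt -P` should issue the same three commands as Example 1. The sketch does not forward extra options to `splam score`, so the `-A` option used in Example 3 would need to be added by hand.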

Scripts for Splam model training & analysis

All the scripts for Splam training and data analysis are in this GitHub repository.


Citation

Kuan-Hao Chao*, Alan Mao, Steven L Salzberg, Mihaela Pertea*, "Splam: a deep-learning-based splice site predictor that improves spliced alignments", bioRxiv 2023.07.27.550754, doi: https://doi.org/10.1101/2023.07.27.550754, 2023
