
A splice site prediction toolkit

Project description

License: GPLv3 | Version: v.0.0.1 | Platform: macOS / Linux


The SpliceAI-toolkit is a flexible framework designed for easy retraining of the SpliceAI model with new datasets. It comes with models pre-trained on various species, including humans (MANE database), mice, thale cress (Arabidopsis), honey bees, and zebrafish. Additionally, the SpliceAI-toolkit is capable of processing genetic variants in VCF format to predict their impact on splicing.

Why SpliceAI-toolkit❓

  1. Easy-to-retrain framework: The SpliceAI-toolkit moves away from the outdated Python 2.7 and older versions of TensorFlow and Keras. It is built on Python 3.7 and PyTorch, which removes those compatibility issues and simplifies retraining significantly: you can retrain your models with just two commands.

  2. Pretrained on a new dataset: SpliceAI is great, but SpliceAI-toolkit makes it even better! Its human model is pretrained on the latest MANE annotations (released in 2022), so your research builds on accurate, up-to-date gene annotations.

  3. Pretrained on various species: Concerned that the SpliceAI model does not generalize to your study species because you are not studying humans? No problem! The SpliceAI-toolkit is released with models pretrained on various species, including human MANE, mouse, thale cress, honey bee, and zebrafish.

  4. Predict the impact of genetic variants on splicing: Similar to SpliceAI, the SpliceAI-toolkit can take genetic variants in VCF format and predict the impact of these variants on splicing with any of the pretrained models.

SpliceAI-toolkit is open-source, free, and combines the ease of Python with the power of PyTorch for accurate splicing predictions.
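As a rough illustration of item 4, the sketch below shows one common way to express a variant's effect on splicing: score the reference and the alternate sequence, then report the largest change per site type. This is a minimal sketch under stated assumptions, not the toolkit's implementation. `toy_model` is a hypothetical stand-in for a trained model (it only flags GT and AG dinucleotides), and `parse_vcf_line`, `apply_variant`, and `max_delta` are illustrative helper names that do not come from the package.

```python
from typing import Dict, List, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int  # 1-based position, as in VCF
    ref: str
    alt: str


def parse_vcf_line(line: str) -> Variant:
    """Read CHROM, POS, REF and ALT from one tab-separated VCF record."""
    chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
    return Variant(chrom, int(pos), ref, alt)


def apply_variant(seq: str, seq_start: int, var: Variant) -> str:
    """Apply the variant to seq; seq_start is the 1-based coordinate of seq[0]."""
    offset = var.pos - seq_start
    if seq[offset:offset + len(var.ref)] != var.ref:
        raise ValueError("REF allele does not match the reference sequence")
    return seq[:offset] + var.alt + seq[offset + len(var.ref):]


def toy_model(seq: str) -> List[Dict[str, float]]:
    """Hypothetical scorer, NOT the real model: 0.9 at GT starts and AG ends."""
    scores = []
    for i in range(len(seq)):
        donor = 0.9 if seq[i:i + 2] == "GT" else 0.0
        acceptor = 0.9 if i >= 1 and seq[i - 1:i + 1] == "AG" else 0.0
        scores.append({"donor": donor, "acceptor": acceptor})
    return scores


def max_delta(ref_scores: List[Dict[str, float]],
              alt_scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Largest absolute score change per site type (substitutions only)."""
    if len(ref_scores) != len(alt_scores):
        raise ValueError("this sketch only handles same-length alleles")
    return {
        kind: max(abs(a[kind] - r[kind]) for r, a in zip(ref_scores, alt_scores))
        for kind in ("donor", "acceptor")
    }


reference = "CCAGGTAAGT"  # made-up sequence starting at coordinate 100
variant = parse_vcf_line("chr1\t104\t.\tG\tC\t.\t.\t.")
alternate = apply_variant(reference, 100, variant)
print(max_delta(toy_model(reference), toy_model(alternate)))
```

In this toy case the G>C substitution destroys a GT dinucleotide, so the donor score changes while the acceptor score does not. A real model would use sequence context on both sides of the variant, and insertions or deletions would need the score arrays to be aligned first.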


Who is it for❓

  1. If you want to study splicing in humans, just use the newly pretrained human SpliceAI-MANE! Better annotation, better results!

  2. If you want to do splicing research in other species, the SpliceAI-toolkit has you covered! It comes with models pretrained on various species! And you can easily train your own SpliceAI with your own genome & annotation data.

  3. If you are interested in predicting the impact of genetic variants on splicing, SpliceAI-toolkit is the perfect tool for you!


What does SpliceAI-toolkit do❓

  • The spliceai-toolkit create-data command takes a genome and annotation file as input and generates a dataset for training and testing your SpliceAI model.

  • The spliceai-toolkit train command uses the created dataset to train your own SpliceAI model.

  • The spliceai-toolkit predict command takes an arbitrary gene sequence and scores each position as a donor site, an acceptor site, or neither.

  • The spliceai-toolkit variant command takes a VCF file and predicts the impact of genetic variants on splicing.
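To make the data and prediction steps above more concrete, here is a minimal sketch of how per-position labels could be derived from an annotation and how three-class scores could be turned into calls. Everything in it is an assumption for illustration: the helper names `label_positions` and `call_sites`, the convention of labelling the last exonic base before an intron as the donor and the first exonic base after an intron as the acceptor, and the class order (neither, acceptor, donor). The toolkit's own conventions may differ.

```python
from typing import List, Sequence, Tuple

# Assumed class order for this sketch only.
LABELS = ("neither", "acceptor", "donor")


def label_positions(tx_start: int, tx_end: int,
                    exons: Sequence[Tuple[int, int]]) -> List[str]:
    """Label every base of a plus-strand transcript.

    exons are sorted (start, end) pairs, 1-based and inclusive.
    """
    labels = ["neither"] * (tx_end - tx_start + 1)
    for i, (start, end) in enumerate(exons):
        if i > 0:  # every exon except the first begins after an intron
            labels[start - tx_start] = "acceptor"
        if i < len(exons) - 1:  # every exon except the last ends before an intron
            labels[end - tx_start] = "donor"
    return labels


def call_sites(scores: Sequence[Sequence[float]]) -> List[str]:
    """Pick the highest-scoring class at each position."""
    return [LABELS[max(range(3), key=lambda k: row[k])] for row in scores]


# Made-up transcript spanning coordinates 1..20 with three exons.
labels = label_positions(1, 20, [(1, 5), (10, 14), (18, 20)])
print([i for i, lab in enumerate(labels) if lab != "neither"])

# Made-up scores for three positions, ordered (neither, acceptor, donor).
print(call_sites([(0.9, 0.05, 0.05), (0.1, 0.2, 0.7), (0.2, 0.7, 0.1)]))
```

Taking the highest-scoring class is the simplest decision rule. Because splice sites are rare, a score threshold on the donor and acceptor classes may be more useful in practice.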


Cite us

Chao, Kuan-Hao, Alan Mao, Anqi Liu, Mihaela Pertea, and Steven L. Salzberg. "SpliceAI-toolkit" bioRxiv.

Jaganathan, K., Panagiotopoulou, S.K., McRae, J.F., Darbandi, S.F., Knowles, D., Li, Y.I., Kosmicki, J.A., Arbelaez, J., Cui, W., Schwartz, G.B. and Chow, E.D. "Predicting splicing from primary sequence with deep learning" Cell.


User support

Please go through the documentation below first. If you have questions about using the package, a bug report, or a feature request, please use the GitHub issue tracker here:

https://github.com/Kuanhao-Chao/spliceAI-toolkit/issues


Key contributors

SpliceAI-toolkit was designed and developed by Kuan-Hao Chao, who also wrote this documentation and designed the SpliceAI-toolkit logo.

