A splice site prediction toolkit

Project description

The SpliceAI-toolkit is a flexible framework designed for easy retraining of the SpliceAI model with new datasets. It comes with models pre-trained on various species, including humans (MANE database), mice, thale cress (Arabidopsis), honey bees, and zebrafish. Additionally, the SpliceAI-toolkit is capable of processing genetic variants in VCF format to predict their impact on splicing.

Why SpliceAI-toolkit❓

  1. Easy-to-retrain framework: The SpliceAI-toolkit moves away from the outdated Python 2.7 and older versions of TensorFlow and Keras. It is built on Python 3.7 and PyTorch, which simplifies retraining and avoids compatibility issues: you can retrain a model with just two commands.

  2. Pretrained on a new dataset: SpliceAI is great, but SpliceAI-toolkit makes it even better! The human model is pretrained with the latest MANE annotations (released in 2022), so your research builds on up-to-date gene annotations.

  3. Pretrained on various species: Concerned that the SpliceAI model does not generalize to your study species because you are not studying humans? No problem! The SpliceAI-toolkit is released with models pretrained on various species, including human MANE, mouse, thale cress, honey bee, and zebrafish.

  4. Predict the impact of genetic variants on splicing: Similar to SpliceAI, the SpliceAI-toolkit can take genetic variants in VCF format and predict the impact of these variants on splicing with any of the pretrained models.

SpliceAI-toolkit is open-source, free, and combines the ease of Python with the power of PyTorch for accurate splicing predictions.


Who is it for❓

  1. If you want to study splicing in humans, just use the newly pretrained human SpliceAI-MANE! Better annotation, better results!

  2. If you want to do splicing research in other species, the SpliceAI-toolkit has you covered! It comes with models pretrained on various species! And you can easily train your own SpliceAI with your own genome & annotation data.

  3. If you are interested in predicting the impact of genetic variants on splicing, SpliceAI-toolkit is the perfect tool for you!


What does SpliceAI-toolkit do❓

  • The spliceai-toolkit create-data command takes a genome and annotation file as input and generates a dataset for training and testing your SpliceAI model.

  • The spliceai-toolkit train command uses the created dataset to train your own SpliceAI model.

  • The spliceai-toolkit predict command takes an arbitrary gene sequence and scores each position, determining whether it is a donor site, an acceptor site, or neither.

  • The spliceai-toolkit variant command takes a VCF file and predicts the impact of genetic variants on splicing.
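
To make the predict and variant steps more concrete, here is a minimal Python sketch of how per-position scores could be interpreted. It is an illustration only, not the toolkit's API: the helper names (label_positions, apply_snv, delta_scores) are hypothetical, and the score layout of one (neither, acceptor, donor) triple per position is an assumption borrowed from the original SpliceAI convention. The toolkit's actual output format may differ.

```python
# Hypothetical sketch: helper names and the score layout are assumptions,
# not the documented behavior of spliceai-toolkit.
LABELS = ("neither", "acceptor", "donor")  # assumed channel order


def label_positions(scores):
    """Return the highest-scoring label for each position."""
    return [LABELS[max(range(3), key=row.__getitem__)] for row in scores]


def apply_snv(sequence, pos, ref, alt):
    """Apply a single-nucleotide variant; pos is 1-based, as in VCF."""
    if sequence[pos - 1] != ref:
        raise ValueError("REF allele does not match the sequence")
    return sequence[: pos - 1] + alt + sequence[pos:]


def delta_scores(ref_scores, alt_scores):
    """Largest score gain and loss per splice-site class between two tracks."""
    result = {}
    for channel, name in ((1, "acceptor"), (2, "donor")):
        diffs = [a[channel] - r[channel] for r, a in zip(ref_scores, alt_scores)]
        result[name] = {
            "gain": max(0.0, max(diffs)),   # strongest increase caused by the variant
            "loss": max(0.0, -min(diffs)),  # strongest decrease caused by the variant
        }
    return result


# Made-up scores for a three-position window, before and after a variant.
ref_scores = [[0.90, 0.05, 0.05], [0.10, 0.10, 0.80], [0.95, 0.03, 0.02]]
alt_scores = [[0.90, 0.05, 0.05], [0.70, 0.10, 0.20], [0.95, 0.03, 0.02]]

print(label_positions(ref_scores))
print(apply_snv("ACGTA", 3, "G", "T"))
print(delta_scores(ref_scores, alt_scores))
```

In this toy example the variant weakens the donor signal at the second position, which shows up as a donor "loss". A real workflow would obtain both score tracks by running a trained model on the reference and the variant sequence.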


Cite us

Chao, Kuan-Hao, Alan Mao, Anqi Liu, Mihaela Pertea, and Steven L. Salzberg. "SpliceAI-toolkit." bioRxiv.

Jaganathan, K., Panagiotopoulou, S.K., McRae, J.F., Darbandi, S.F., Knowles, D., Li, Y.I., Kosmicki, J.A., Arbelaez, J., Cui, W., Schwartz, G.B. and Chow, E.D. "Predicting splicing from primary sequence with deep learning." Cell.


User support

Please go through the documentation below first. If you have questions about using the package, a bug report, or a feature request, please use the GitHub issue tracker here:

https://github.com/Kuanhao-Chao/spliceAI-toolkit/issues


Key contributors

SpliceAI-toolkit was designed and developed by Kuan-Hao Chao, who also wrote this documentation and designed the logo.


Download files

Source distribution: spliceai-toolkit-0.0.1.tar.gz (45.0 kB)

Built distribution: spliceai_toolkit-0.0.1-py3-none-any.whl (71.0 kB, Python 3)
