A splice site prediction toolkit
Project description
The SpliceAI-toolkit is a flexible framework designed for easy retraining of the SpliceAI model with new datasets. It comes with models pre-trained on various species, including humans (MANE database), mice, thale cress (Arabidopsis), honey bees, and zebrafish. Additionally, the SpliceAI-toolkit is capable of processing genetic variants in VCF format to predict their impact on splicing.
Why SpliceAI-toolkit❓
Easy-to-retrain framework: The original SpliceAI depends on the outdated Python 2.7 and on older versions of TensorFlow and Keras. The SpliceAI-toolkit is instead built on Python 3.7 and PyTorch, which removes those compatibility issues and simplifies retraining significantly: you can retrain your models with just two commands.
Pretrained on new dataset: SpliceAI is great, but SpliceAI-toolkit makes it even better! Its human model is pretrained on the latest MANE annotations (released in 2022), so your research builds on up-to-date genomic annotation.
Pretrained on various species: Concerned that the SpliceAI model does not generalize to your study species because you are not studying humans? No problem! The SpliceAI-toolkit is released with models pretrained on various species, including human MANE, mouse, thale cress, honey bee, and zebrafish.
Predict the impact of genetic variants on splicing: Similar to SpliceAI, the SpliceAI-toolkit can take genetic variants in VCF format and predict the impact of these variants on splicing with any of the pretrained models.
SpliceAI-toolkit is open-source, free, and combines the ease of Python with the power of PyTorch for accurate splicing predictions.
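The variant-scoring idea described above can be illustrated with a small sketch. It is not the toolkit's implementation: `parse_vcf_line`, `apply_variant`, `max_delta`, and `toy_donor_score` are hypothetical names introduced here, the toy scorer merely stands in for a trained model, and the "largest gain / largest loss" summary is an assumption modelled on how variant effects on splicing are commonly reported. Only single-allele substitutions are handled.

```python
from typing import List, Tuple


def parse_vcf_line(line: str) -> Tuple[str, int, str, str]:
    """Return (chrom, 1-based pos, ref, alt) from one VCF data line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt = fields[:5]
    return chrom, int(pos), ref, alt


def apply_variant(seq: str, seq_start: int, pos: int, ref: str, alt: str) -> str:
    """Replace ref with alt in seq; seq_start is the 1-based coordinate of seq[0]."""
    i = pos - seq_start
    if seq[i:i + len(ref)].upper() != ref.upper():
        raise ValueError("REF allele does not match the reference sequence")
    return seq[:i] + alt + seq[i + len(ref):]


def toy_donor_score(seq: str) -> List[float]:
    """Hypothetical stand-in for a model: 1.0 wherever a 'GT' dinucleotide starts."""
    return [1.0 if seq[i:i + 2] == "GT" else 0.0 for i in range(len(seq))]


def max_delta(ref_scores: List[float], alt_scores: List[float]) -> Tuple[float, float]:
    """Largest score gain and largest score loss across aligned positions."""
    deltas = [a - r for r, a in zip(ref_scores, alt_scores)]
    return max(max(deltas), 0.0), max(-min(deltas), 0.0)


# Toy reference window whose first base sits at genomic position 100.
reference = "AAGGTAAGT"
chrom, pos, ref, alt = parse_vcf_line("chr1\t104\t.\tT\tC\t.\t.\t.")
mutated = apply_variant(reference, 100, pos, ref, alt)
gain, loss = max_delta(toy_donor_score(reference), toy_donor_score(mutated))
print(mutated, gain, loss)
```

In this toy case the substitution destroys one "GT" dinucleotide, so the sketch reports a loss and no gain. With a real model, the same reference-versus-alternate comparison would be made on the model's per-position scores.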
Who is it for❓
If you want to study splicing in humans, just use the newly pretrained human SpliceAI-MANE! Better annotation, better results!
If you want to do splicing research in other species, the SpliceAI-toolkit has you covered! It comes with models pretrained on various species! And you can easily train your own SpliceAI with your own genome & annotation data.
If you are interested in predicting the impact of genetic variants on splicing, SpliceAI-toolkit is the perfect tool for you!
What does SpliceAI-toolkit do❓
The spliceai-toolkit create-data command takes a genome and an annotation file as input and generates a dataset for training and testing your SpliceAI model.
The spliceai-toolkit train command uses the created dataset to train your own SpliceAI model.
The spliceai-toolkit predict command takes an arbitrary gene sequence and predicts a score for each position, determining whether it is a donor, an acceptor, or neither.
The spliceai-toolkit variant command takes a VCF file and predicts the impact of genetic variants on splicing.
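To make the per-position view used by these commands concrete, here is a minimal sketch of how exon coordinates can be turned into training labels and how per-position scores can be turned into calls. It does not use the toolkit's code or its command-line options. The function names, the class order (neither, acceptor, donor), and the convention of labelling the last base of an exon as the donor and the first base of the next exon as the acceptor are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed class order for this sketch only.
LABELS = ("neither", "acceptor", "donor")


def one_hot(seq: str) -> List[List[int]]:
    """Encode A/C/G/T as 4-element indicator rows; any other base becomes all zeros."""
    table = {"A": [1, 0, 0, 0], "C": [0, 1, 0, 0], "G": [0, 0, 1, 0], "T": [0, 0, 0, 1]}
    return [table.get(base, [0, 0, 0, 0]) for base in seq.upper()]


def splice_labels(length: int, exons: Sequence[Tuple[int, int]]) -> List[int]:
    """Label each position of a plus-strand transcript.

    exons are sorted (start, end) pairs, 0-based and end-exclusive.
    Every exon except the first starts with an acceptor (class 1);
    every exon except the last ends with a donor (class 2).
    """
    labels = [0] * length
    for k, (start, end) in enumerate(exons):
        if k > 0:
            labels[start] = 1
        if k < len(exons) - 1:
            labels[end - 1] = 2
    return labels


def call_sites(scores: Sequence[Sequence[float]]) -> List[str]:
    """Pick the highest-scoring class at each position."""
    return [LABELS[max(range(3), key=lambda c: row[c])] for row in scores]


# Two exons separated by a three-base intron in a 12-base toy transcript.
labels = splice_labels(12, [(0, 4), (7, 12)])
calls = call_sites([[0.9, 0.05, 0.05], [0.1, 0.2, 0.7], [0.2, 0.7, 0.1]])
print(labels)
print(calls)
```

A dataset built this way pairs the one-hot encoded sequence with the label vector. A trained model then outputs three scores per position, and the highest one gives the predicted class.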
Cite us
Chao, Kuan-Hao, Alan Mao, Anqi Liu, Mihaela Pertea, and Steven L. Salzberg. "SpliceAI-toolkit" bioRxiv.
Jaganathan, K., Panagiotopoulou, S.K., McRae, J.F., Darbandi, S.F., Knowles, D., Li, Y.I., Kosmicki, J.A., Arbelaez, J., Cui, W., Schwartz, G.B. and Chow, E.D. "Predicting splicing from primary sequence with deep learning" Cell.
User support
Please go through the documentation below first. If you have a question about using the package, a bug report, or a feature request, please use the GitHub issue tracker:
https://github.com/Kuanhao-Chao/spliceAI-toolkit/issues
Key contributors
SpliceAI-toolkit was designed and developed by Kuan-Hao Chao, who also wrote this documentation and designed the SpliceAI-toolkit logo.
Table of contents
Examples
Download files
File details
Details for the file spliceai-toolkit-0.0.1.tar.gz.
File metadata
- Download URL: spliceai-toolkit-0.0.1.tar.gz
- Upload date:
- Size: 45.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bba132fd459b4951d189b087a931b988bf246c414379c6476cae2e2ac9f215a8 |
| MD5 | af233b8bb9e06085d34a37268c5985f7 |
| BLAKE2b-256 | 45d5097ad9be037deab3b2b8d9b17c8a3aac86f8579d849afb65bd925127df70 |
File details
Details for the file spliceai_toolkit-0.0.1-py3-none-any.whl.
File metadata
- Download URL: spliceai_toolkit-0.0.1-py3-none-any.whl
- Upload date:
- Size: 71.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2608d0b17e19588c8a52fa99259cacdc4edddb753bd5a99fa6496df41b0dcf55 |
| MD5 | 63308d2bb3a60bc2c4f1e9e5efeb63b0 |
| BLAKE2b-256 | a984aa5aecc4f2b874e20b18a8d0bbfe4622fa87384f9a366e9ba4da5b324cbd |