GenET: Genome Editing Toolkit
Project description
Genome Editing Toolkit
Since 2022. 08. 19.
Welcome to GenET
GenET (Genome Editing Toolkit) is a library of various python functions for the purpose of analyzing and evaluating data from genome editing experiments. GenET is still in its early stages of development and continue to improve and expand. Currently planned functions include guideRNA design, saturation library design, deep sequenced data analysis, and guide RNA activity prediction.
Please see the documentation.
Installation
1/ Create virtual environment and install genet
# Create virtual env for genet. (python 3.8 was tested)
conda create -n genet python=3.8
conda activate genet
# Install genet ( >= ver. 0.7.0)
pip install genet
2/ Install Pytorch (v1.11.0 was tested)
Pytorch ver.2 is not compatible yet.
# For OSX (MacOS)
pip install torch==1.11.0
# For Linux and Windows
# CUDA 11.3 (choose version degending on your GPU)
pip install torch==1.11.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
# CPU only
pip install torch==1.11.0+cpu --extra-index-url https://download.pytorch.org/whl/cpu
3/ Install ViennaRNA
# install ViennaRNA package for prediction module
conda install viennarna
Who should use GenET?
GenET was developed for anyone interested in the field of genome editing. Especially, Genet can provide aid to those with the following objectives.:
- Develop a quick and easy to design an genome editing experiment for a specific gene.
- Perform genome editing analysis based on sequening data
- Predict the activtiy of specific guideRNAs or all guideRNAs designed for editing a specific product.
Example: Prediction of prime editing efficiency by DeepPrime
DeepPrime is a prediction model for evaluating prime editing guideRNAs (pegRNAs) that target specific target sites for prime editing (Yu et al. Cell 2023). DeepSpCas9 prediction score is calculated simultaneously and requires tensorflow (version >=2.6). DeepPrime was developed on pytorch.
from genet.predict import DeepPrime
seq_wt = 'ATGACAATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATGTCAACTGAAACCTTAAAGTGAGTATTTAATTGAGCTGAAGT'
seq_ed = 'ATGACAATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGAACTATAACCTGCAAATGTCAACTGAAACCTTAAAGTGAGTATTTAATTGAGCTGAAGT'
pegrna = DeepPrime('Test', seq_wt, seq_ed, edit_type='sub', edit_len=1)
# check designed pegRNAs
>>> pegrna.features
ID | WT74_On | Edited74_On | PBSlen | RTlen | RT-PBSlen | Edit_pos | Edit_len | RHA_len | type_sub | type_ins | type_del | Tm1 | Tm2 | Tm2new | Tm3 | Tm4 | TmD | nGCcnt1 | nGCcnt2 | nGCcnt3 | fGCcont1 | fGCcont2 | fGCcont3 | MFE3 | MFE4 | DeepSpCas9_score | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Test | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxxxxxCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 7 | 35 | 42 | 34 | 1 | 1 | 1 | 0 | 0 | 16.19097 | 62.1654 | 62.1654 | -277.939 | 58.22525 | -340.105 | 5 | 16 | 21 | 71.42857 | 45.71429 | 50 | -10.4 | -0.6 | 45.96754 |
1 | Test | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxxxxCCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 8 | 35 | 43 | 34 | 1 | 1 | 1 | 0 | 0 | 30.19954 | 62.1654 | 62.1654 | -277.939 | 58.22525 | -340.105 | 6 | 16 | 22 | 75 | 45.71429 | 51.16279 | -10.4 | -0.6 | 45.96754 |
2 | Test | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxxxACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 9 | 35 | 44 | 34 | 1 | 1 | 1 | 0 | 0 | 33.78395 | 62.1654 | 62.1654 | -277.939 | 58.22525 | -340.105 | 6 | 16 | 22 | 66.66667 | 45.71429 | 50 | -10.4 | -0.6 | 45.96754 |
3 | Test | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxxCACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 10 | 35 | 45 | 34 | 1 | 1 | 1 | 0 | 0 | 38.51415 | 62.1654 | 62.1654 | -277.939 | 58.22525 | -340.105 | 7 | 16 | 23 | 70 | 45.71429 | 51.11111 | -10.4 | -0.6 | 45.96754 |
4 | Test | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxxACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 11 | 35 | 46 | 34 | 1 | 1 | 1 | 0 | 0 | 40.87411 | 62.1654 | 62.1654 | -277.939 | 58.22525 | -340.105 | 7 | 16 | 23 | 63.63636 | 45.71429 | 50 | -10.4 | -0.6 | 45.96754 |
5 | Test | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATG | xxxxxxxxxAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGxxxxxxxxxxxxxxxxxx | 12 | 35 | 47 | 34 | 1 | 1 | 1 | 0 | 0 | 40.07098 | 62.1654 | 62.1654 | -277.939 | 58.22525 | -340.105 | 7 | 16 | 23 | 58.33333 | 45.71429 | 48.93617 | -10.4 | -0.6 | 45.96754 |
Next, select model PE system and run DeepPrime
pe2max_output = pegrna.predict(pe_system='PE2max', cell_type='HEK293T')
>>> pe2max_output.head()
Target | Spacer | RT-PBS | PBSlen | RTlen | RT-PBSlen | Edit_pos | Edit_len | RHA_len | PE2max_score | |
---|---|---|---|---|---|---|---|---|---|---|
0 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGG | 7 | 35 | 42 | 34 | 1 | 1 | 0.904907 |
1 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGGG | 8 | 35 | 43 | 34 | 1 | 1 | 2.377118 |
2 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGGGT | 9 | 35 | 44 | 34 | 1 | 1 | 2.613841 |
3 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGGGTG | 10 | 35 | 45 | 34 | 1 | 1 | 3.643573 |
4 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGGGTGT | 11 | 35 | 46 | 34 | 1 | 1 | 3.770234 |
The previous function, pe_score()
, is still available for use. However, please note that this function will be deprecated in the near future.
from genet import predict as prd
# Place WT sequence and Edited sequence information, respectively.
# And select the edit type you want to make and put it in.
#Input seq: 60bp 5' context + 1bp center + 60bp 3' context (total 121bp)
seq_wt = 'ATGACAATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGAAGAACTATAACCTGCAAATGTCAACTGAAACCTTAAAGTGAGTATTTAATTGAGCTGAAGT'
seq_ed = 'ATGACAATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGAAACTGAGACGAACTATAACCTGCAAATGTCAACTGAAACCTTAAAGTGAGTATTTAATTGAGCTGAAGT'
alt_type = 'sub1'
df_pe = prd.pe_score(seq_wt, seq_ed, alt_type)
df_pe.head()
Target | Spacer | RT-PBS | PBSlen | RTlen | RT-PBSlen | Edit_pos | Edit_len | RHA_len | PE2max_score | |
---|---|---|---|---|---|---|---|---|---|---|
0 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGG | 7 | 35 | 42 | 34 | 1 | 1 | 0.904907 |
1 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGGG | 8 | 35 | 43 | 34 | 1 | 1 | 2.377118 |
2 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGGGT | 9 | 35 | 44 | 34 | 1 | 1 | 2.613841 |
3 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGGGTG | 10 | 35 | 45 | 34 | 1 | 1 | 3.643573 |
4 | ATAAAAGACAACACCCTTGCCTTGTGGAGTTTTCAAAGCTCCCAGA... | ATAAAAGACAACACCCTTGCCTTGTGGAGT | CGTCTCAGTTTCTGGGAGCTTTGAAAACTCCACAAGGCAAGGGTGT | 11 | 35 | 46 | 34 | 1 | 1 | 3.770234 |
It is also possible to predict other cell lines (A549, DLD1...) and PE systems (PE2max, PE4max...).
df_pe = prd.pe_score(seq_wt, seq_ed, alt_type, sID='MyGene', pe_system='PE4max', cell_type='A549')
Please send all comments and questions to gsyu93@gmail.com
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file genet-0.13.3.tar.gz
.
File metadata
- Download URL: genet-0.13.3.tar.gz
- Upload date:
- Size: 75.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.12
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 6b13c92ca7c29925dcca328ae1bc9a61261cee72b1d107e5e5fc641ae775036d |
|
MD5 | b7ca102113fa56e9046cae537405eab0 |
|
BLAKE2b-256 | 5b76a66e71eeefaaf39a2226e7b5ed40113fde7b846ce184a26314b277cf5159 |
File details
Details for the file genet-0.13.3-py3-none-any.whl
.
File metadata
- Download URL: genet-0.13.3-py3-none-any.whl
- Upload date:
- Size: 86.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.12
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 79438fb491f63f63b28734f11214ab48ab3ff8dfb5ff1c0605fc45a093f1b8c4 |
|
MD5 | 4e1ddeedd6eda4dad84d1c1242752737 |
|
BLAKE2b-256 | 8e4b2dbfd2515ec5273e12d761cd75850b49a33764bf63ed684e38df56a49d60 |