genet

GenET: Genome Editing Toolkit

Genome Editing Toolkit
Since 2022. 08. 19.

Welcome to GenET

GenET (Genome Editing Toolkit) is a library of Python functions for analyzing and evaluating data from genome editing experiments. GenET is still in the early stages of development and continues to improve and expand. Currently planned functions include guide RNA design, saturation library design, deep sequencing data analysis, and guide RNA activity prediction.

Please see the documentation.

Installation

# Create virtual env for genet
conda create -n genet python=3.10
conda activate genet

# Install genet
pip install genet
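
After installing, it can be useful to confirm that the package is visible in the active environment. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of GenET.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("genet")
if found is None:
    print("genet is not installed in this environment")
else:
    print(f"genet {found} is installed")
```

If the message reports that genet is missing, the most likely cause is that the `genet` conda environment is not the active one.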

Who should use GenET?

GenET was developed for anyone interested in the field of genome editing. In particular, GenET can help those with the following objectives:

Design a genome editing experiment for a specific gene quickly and easily.
Perform genome editing analysis based on sequencing data.
Predict the activity of specific guide RNAs, or of all guide RNAs designed for editing a specific product.
Design a saturation library for a specific gene.

Example 1: Download genomic data from NCBI database

The genomic information required for research is often downloaded from public databases like NCBI. When only basic information is needed, searching on the NCBI website is usually sufficient. However, when a large amount of data is required or when specific reference sequence files are needed for certain analysis pipelines, it may be necessary to find and download files containing the specific information.

The GenET database module provides functions to easily download frequently used data files from NCBI. For example, to download the genomic assembly of Homo sapiens, you can use GetGenome as follows:

from genet.database import GetGenome

To create a GetGenome instance for the desired species:

# Specify the species
species = "Homo sapiens"

# Create a GetGenome instance
genome = GetGenome(species)

To check the available files related to the assembly of the specified species:

# Check available files
available_files = genome.contents()
print("Available files:", available_files)

Available files:
['README.txt', 'Annotation_comparison', 'GCF_000001405.40_GRCh38.p14_assembly_structure', 'GCF_000001405.40-RS_2023_10_annotation_report.xml', 'annotation_hashes.txt', 'RefSeq_transcripts_alignments', 'GCF_000001405.40_GRCh38.p14_assembly_regions.txt', ...]
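
The printed output suggests that `contents()` returns a plain list of file and directory names. Assuming that is the case, the list can be narrowed with ordinary Python before choosing what to download. The sketch below works on names copied from the listing above; the helper `select_files` is hypothetical and not part of GenET.

```python
from typing import Iterable, List


def select_files(names: Iterable[str], suffix: str) -> List[str]:
    """Keep only the names that end with `suffix`, preserving their order."""
    return [name for name in names if name.endswith(suffix)]


# Names copied from the listing above (the listing itself is truncated with "...").
available_files = [
    "README.txt",
    "Annotation_comparison",
    "GCF_000001405.40_GRCh38.p14_assembly_structure",
    "GCF_000001405.40-RS_2023_10_annotation_report.xml",
    "annotation_hashes.txt",
    "RefSeq_transcripts_alignments",
    "GCF_000001405.40_GRCh38.p14_assembly_regions.txt",
]

text_files = select_files(available_files, ".txt")
print(text_files)
# ['README.txt', 'annotation_hashes.txt', 'GCF_000001405.40_GRCh38.p14_assembly_regions.txt']
```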

To download the desired file with the specified name to the desired path:

# Specify the desired file name
file_name = "example_genome.fasta"

# Specify the desired download path
download_path = "/desired/download/path/"

# Download the file
genome.download(file_name, download_path)
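
The source does not say whether `download` creates a missing target directory. As a precaution, the directory can be created beforehand with the standard library. This is a sketch under that assumption; `ensure_dir` is a hypothetical helper, and the GenET calls are left as comments because they need the package and network access.

```python
from pathlib import Path


def ensure_dir(path: str) -> Path:
    """Create `path` and any missing parent directories; do nothing if it already exists."""
    directory = Path(path).expanduser()
    directory.mkdir(parents=True, exist_ok=True)
    return directory


# Usage with the example above:
# ensure_dir(download_path)
# genome.download(file_name, download_path)
```

The original `download_path` string is still passed to `download` unchanged, so any trailing slash the function may expect is preserved.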

Example 2: Prediction of prime editing efficiency by DeepPrime

DeepPrime is a prediction model for evaluating prime editing guide RNAs (pegRNAs) that target specific sites for prime editing (Yu et al. Cell 2023). The DeepSpCas9 prediction score is calculated at the same time and requires TensorFlow (version >=2.6). DeepPrime itself was developed with PyTorch. For more details, please see the documentation.

from genet.predict import DeepPrime

seq = 'CCGAGTTGGTTCATCATCATTCAACGGTGGCCGACGGGCTCATCACCACGCTCCATTATC(C/T)AGCCCCAAAGCGCAACAAGCCCACTGTCTATGGTGTGTCCCCCAACTACGACAAGTGGGA'

pegrna = DeepPrime(seq)

# check designed pegRNAs
pegrna.features.head()

	ID	Spacer	RT-PBS	PBS_len	RTT_len	RT-PBS_len	Edit_pos	Edit_len	RHA_len	Target	...	deltaTm_Tm4-Tm2	GC_count_PBS	GC_count_RTT	GC_count_RT-PBS	GC_contents_PBS	GC_contents_RTT	GC_contents_RT-PBS	MFE_RT-PBS-polyT	DeepSpCas9_score
0	SampleName	GGTTCATCATCATTCAACGG	TAGATAATGGAGCGTGGTGATGAGCCCGTCGGCCACCGTTGAATG	7	38	45	37	1	1	AGTTGGTTCATCATCATTCAACGGTGGCCGACGGGCTCATCACCAC...	...	-510.285	2	23	25	28.57143	60.52632	55.55556	-12.7	76.43662
1	SampleName	GGTTCATCATCATTCAACGG	TAGATAATGGAGCGTGGTGATGAGCCCGTCGGCCACCGTTGAATGA	8	38	46	37	1	1	AGTTGGTTCATCATCATTCAACGGTGGCCGACGGGCTCATCACCAC...	...	-510.285	2	23	25	25	60.52632	54.34783	-11.4	76.43662
2	SampleName	GGTTCATCATCATTCAACGG	TAGATAATGGAGCGTGGTGATGAGCCCGTCGGCCACCGTTGAATGAT	9	38	47	37	1	1	AGTTGGTTCATCATCATTCAACGGTGGCCGACGGGCTCATCACCAC...	...	-510.285	2	23	25	22.22222	60.52632	53.19149	-11.4	76.43662
3	SampleName	GGTTCATCATCATTCAACGG	TAGATAATGGAGCGTGGTGATGAGCCCGTCGGCCACCGTTGAATGATG	10	38	48	37	1	1	AGTTGGTTCATCATCATTCAACGGTGGCCGACGGGCTCATCACCAC...	...	-510.285	3	23	26	30	60.52632	54.16667	-11.2	76.43662
4	SampleName	GGTTCATCATCATTCAACGG	TAGATAATGGAGCGTGGTGATGAGCCCGTCGGCCACCGTTGAATGATGA	11	38	49	37	1	1	AGTTGGTTCATCATCATTCAACGGTGGCCGACGGGCTCATCACCAC...	...	-510.285	3	23	26	27.27273	60.52632	53.06122	-11.2	76.43662
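
The input sequence above marks the intended edit inline as `(C/T)`, with flanking sequence on both sides. Reading this as "reference C replaced by T" is consistent with the RT-PBS sequences in the table, but the exact input rules are not stated here, so the following is only a sketch for inspecting such a string before passing it to `DeepPrime`. The `EditSpec` class and `parse_edit` function are hypothetical helpers, and the pattern covers only substitutions written in this form.

```python
import re
from typing import NamedTuple


class EditSpec(NamedTuple):
    left: str   # sequence before the edit
    ref: str    # reference base(s) inside the parentheses, before the slash
    alt: str    # edited base(s) inside the parentheses, after the slash
    right: str  # sequence after the edit

    @property
    def wild_type(self) -> str:
        return self.left + self.ref + self.right

    @property
    def edited(self) -> str:
        return self.left + self.alt + self.right


_EDIT_PATTERN = re.compile(r"^([ACGT]+)\(([ACGT]+)/([ACGT]+)\)([ACGT]+)$")


def parse_edit(seq: str) -> EditSpec:
    """Split 'LEFT(REF/ALT)RIGHT' into its four parts, or raise ValueError."""
    match = _EDIT_PATTERN.match(seq.upper())
    if match is None:
        raise ValueError("expected a sequence of the form LEFT(REF/ALT)RIGHT")
    return EditSpec(*match.groups())


seq = 'CCGAGTTGGTTCATCATCATTCAACGGTGGCCGACGGGCTCATCACCACGCTCCATTATC(C/T)AGCCCCAAAGCGCAACAAGCCCACTGTCTATGGTGTGTCCCCCAACTACGACAAGTGGGA'
parts = parse_edit(seq)
print(parts.ref, "->", parts.alt)
print(len(parts.left), "nt before the edit,", len(parts.right), "nt after it")
```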

Next, select the PE system and cell type, then run DeepPrime:

pe2max_output = pegrna.predict(pe_system='PE2max', cell_type='HEK293T')

pe2max_output.head()

	ID	PE2max_score	Spacer	RT-PBS	PBS_len	RTT_len	RT-PBS_len	Edit_pos	Edit_len	RHA_len	Target
0	SampleName	2.143	GGTTCATCATCATTCAACGG	TAGATAATGGAGCGTGGTGATGAGCCCGTCGGCCACCGTTGAATG	7	38	45	37	1	1	AGTTGGTTCATCATCATTCAACGGTGGCCGACGGGCTCATCACCAC...
1	SampleName	3.140197	GGTTCATCATCATTCAACGG	TAGATAATGGAGCGTGGTGATGAGCCCGTCGGCCACCGTTGAATGA	8	38	46	37	1	1	AGTTGGTTCATCATCATTCAACGGTGGCCGACGGGCTCATCACCAC...
2	SampleName	2.541219	GGTTCATCATCATTCAACGG	TAGATAATGGAGCGTGGTGATGAGCCCGTCGGCCACCGTTGAATGAT	9	38	47	37	1	1	AGTTGGTTCATCATCATTCAACGGTGGCCGACGGGCTCATCACCAC...
3	SampleName	6.538445	GGTTCATCATCATTCAACGG	TAGATAATGGAGCGTGGTGATGAGCCCGTCGGCCACCGTTGAATGATG	10	38	48	37	1	1	AGTTGGTTCATCATCATTCAACGGTGGCCGACGGGCTCATCACCAC...
4	SampleName	7.436117	GGTTCATCATCATTCAACGG	TAGATAATGGAGCGTGGTGATGAGCCCGTCGGCCACCGTTGAATGATGA	11	38	49	37	1	1	AGTTGGTTCATCATCATTCAACGGTGGCCGACGGGCTCATCACCAC...
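
A common next step is to rank the candidates by predicted score. The sketch below does this with plain Python on the five rows shown above, so it runs without GenET; the `top_n` helper is hypothetical. If `pe2max_output` is a pandas DataFrame, as the `head()` call suggests, `pe2max_output.sort_values("PE2max_score", ascending=False)` would be the equivalent, but that is an assumption about the return type.

```python
from typing import Dict, List

# PBS_len and PE2max_score values copied from the five rows shown above.
rows: List[Dict[str, float]] = [
    {"PBS_len": 7, "PE2max_score": 2.143},
    {"PBS_len": 8, "PE2max_score": 3.140197},
    {"PBS_len": 9, "PE2max_score": 2.541219},
    {"PBS_len": 10, "PE2max_score": 6.538445},
    {"PBS_len": 11, "PE2max_score": 7.436117},
]


def top_n(records: List[Dict[str, float]], n: int, key: str = "PE2max_score") -> List[Dict[str, float]]:
    """Return the `n` records with the highest value for `key`, best first."""
    return sorted(records, key=lambda record: record[key], reverse=True)[:n]


best = top_n(rows, 1)[0]
print(best["PBS_len"], best["PE2max_score"])  # 11 7.436117
```

Among these five rows, the pegRNA with the 11-nt PBS has the highest predicted PE2max score. The full output may contain many more candidates than the five displayed by `head()`.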

Please send all comments and questions to gsyu93@gmail.com

This version: 0.17.1 (Oct 18, 2024)
