BridgeEvaluator

This package designs bridgeRNAs targeting a given locus and provides metrics to evaluate their efficiency and specificity.

Author: Jaymin Patel (jayman1466@gmail.com)

Installation

Streamlined Installation

You can install this package via pip:

pip install BridgeEvaluator

You must also install the Vienna RNA suite if you want to score the predicted folding of the designed bridgeRNAs (bRNAs). If it is not installed, set score_structure=False when calling design_bridges().
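As a rough sketch, a script can check whether Vienna RNA appears to be installed before deciding how to set score_structure. The helper name vienna_available is hypothetical, and the check rests on an assumption: that the ViennaRNA Python module (imported as RNA) and the RNAforester executable are both needed. How BridgeEvaluator actually locates them is not documented here.

```python
import importlib.util
import shutil


def vienna_available():
    """Return True if the ViennaRNA Python module and RNAforester both look installed.

    ASSUMPTION: both are required for structure scoring; adjust if only one is.
    """
    has_module = importlib.util.find_spec("RNA") is not None
    has_forester = shutil.which("RNAforester") is not None
    return has_module and has_forester


# Fall back to score_structure=False when Vienna RNA cannot be found.
score_structure = vienna_available()
print(f"score_structure={score_structure}")

# Then pass the flag through, for example:
# design_bridges(target_locus, target_name, genbank_file, score_structure=score_structure)
```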

If you get a "package not found" error, updating your Python and pip versions usually fixes it. It may make sense to do this within a new conda environment.

Manual Installation

Alternatively, for manual installation, you can place the files from the "src/BridgeEvaluator/" directory directly into your working directory. If you install manually, make sure you have Python >= 3.9 and the following dependencies installed:

biopython >= 1.85

Levenshtein >= 0.27.1

viennarna >= 2.7.0

pandas >= 2.2.0
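For a manual installation, a short script like the one below can report whether the listed dependencies meet these minimums. It is only a sketch: check_dependencies and version_tuple are hypothetical helpers, the version comparison is deliberately simple (leading numeric components only), and the distribution names are assumed to match the list above.

```python
import sys
from importlib import metadata
from itertools import takewhile

# Minimum versions copied from the dependency list above.
REQUIRED = {
    "biopython": "1.85",
    "Levenshtein": "0.27.1",
    "viennarna": "2.7.0",
    "pandas": "2.2.0",
}


def version_tuple(version):
    """Turn '2.7.0rc1' into (2, 7, 0); stops at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:  # pad so that '1.85' compares like '1.85.0'
        parts.append(0)
    return tuple(parts)


def check_dependencies(required=REQUIRED):
    """Return {name: status} where status is 'ok', 'missing', or a 'too old' note."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if version_tuple(installed) >= version_tuple(minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


print("Python >= 3.9:", sys.version_info >= (3, 9))
for name, status in check_dependencies().items():
    print(f"{name}: {status}")
```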

Usage

The simplest usage within a python script is as follows:

from bridge_evaluator import design_bridges

#Specify the locus that will be scanned to design bRNAs
target_locus = "ATGAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCCGTGGAGAGGGTGAAGGTGATGCTACAAACGGAAAACTCACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCGTGGCCAACACTTGTCACTACTCTGACCTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCACATGAAACGGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAAGATGACGGGACCTACAAGACGCGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAGGGTATTGATTTTAAAGAAGATGGAAACATTCTTGGACACAAACTCGAGTACAACTTTAACTCACACAATGTATACATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAACGTTGAAGATGGTTCCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCGACACAATCTGTCCTTTCGAAAGATCCCAACGAAAAGCGTGACCACATGGTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTACACATGGCATGGATGAGCTCTACAAAtaa"

#Name of this locus. The results will be written to a file with this name
target_name = "sfGFP"

#GenBank file of the full genome of the target. This will be used to evaluate possible off-targets
genbank_file = "MG1655.gb"

design_bridges(target_locus, target_name, genbank_file)

Required Parameters

target_locus: Nucleotide sequence of the locus you want to target. This script will scan this locus to identify all permissive 14mer target sites and provide attributes to score their predicted efficiency and specificity.

target_name: Name of the locus you are targeting. This will be used to name the designed bridgeRNAs and the output file of the script. BridgeRNA names follow the pattern bridge_IS621_T_{target_name}_{index}D{donor_name}, where {target_name} and {donor_name} are taken from the arguments of this function.

genbank_file: GenBank file of the recipient genome. This will be used to identify potential off-target sites for each designed bridgeRNA.
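To make the scanning and naming described above concrete, here is a minimal sketch of enumerating candidate 14-mer sites and building names in the documented pattern. It is not the package's implementation. The functions candidate_sites and bridge_name are hypothetical. The position of the 2-nt core inside the 14-mer (0-based offset 7) is an assumption, chosen because the native IS621 donor "ACAGTATCTTGTAT" carries "CT" at that offset. The 0-based index, counted along the scanned strand, is also an assumption.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

NATIVE_DONOR = "ACAGTATCTTGTAT"  # default donor_seq from this README
CORE_START = 7  # ASSUMPTION: 0-based offset of the 2-nt core within a 14-mer


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def candidate_sites(locus, cores=("CT",), site_len=14):
    """Yield (index, strand, site) for each 14-mer whose core is in `cores`.

    ASSUMPTION: index is 0-based and counted along the strand being scanned.
    """
    locus = locus.upper()
    for strand, seq in (("+", locus), ("-", reverse_complement(locus))):
        for i in range(len(seq) - site_len + 1):
            site = seq[i:i + site_len]
            if site[CORE_START:CORE_START + 2] in cores:
                yield i, strand, site


def bridge_name(target_name, index, donor_name="1"):
    """Format a name using the pattern documented for target_name."""
    return f"bridge_IS621_T_{target_name}_{index}D{donor_name}"


for index, strand, site in candidate_sites("AAAAAAACTAAAAA"):
    print(bridge_name("sfGFP", index), strand, site)
```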

Optional Parameters

donor_seq: The donor sequence (14mer) to use. Default is the native IS621 donor "ACAGTATCTTGTAT"

donor_name: Name of the Donor. This will be used to name the designed bridgeRNAs. Default is "1"

cores: Core sequences that can be used, provided as a list. Note, the core of the provided donor sequence will be modified to match the core of the target sequence for each designed bridgeRNA. Default is ['CT']

kmer: Number of leading bases (bp) of the target sequence used to identify perfect and imperfect off-targets in the recipient genome. Default is 11

avoid_restriction: Restriction sites (and other sequences) to avoid in the designed bridgeRNAs, provided as a list. Reverse complements must be provided manually. Default is []

check_imperfect: Whether to look for imperfect off-target sequences in the recipient genome. This increases computational time dramatically. All imperfect off-targets with a Levenshtein distance <= 2 are tabulated, where an indel scores 2 and a mismatch scores 1. Default is True

score_structure: Whether to evaluate the predicted secondary structure of the designed bridgeRNAs. The predicted MFE (minimum free energy) structure of each designed bridgeRNA is compared to the reference native IS621 secondary structure using Vienna RNA's RNAforester forest-based structural aligner, giving a similarity score from 0 to 1. Default is True

feature_type: For internal metadata, you can include a feature type (e.g. CDS, ncRNA) for the locus you are targeting. Default is ""

primer_seqs: In addition to full bridgeRNA sequences, this script converts these sequences into DNA fragments for synthesis, which can be cloned into the BsaI sites of the Patel et al. IS110 vectors. If you'd like to include PCR priming sites to amplify these fragments, you can specify them as a dictionary of lists, keyed by core sequence, as follows: {'CT': ["for_primer_seq1", "rev_primer_seq1"], 'GT': ["for_primer_seq2", "rev_primer_seq1"]}. Default is {"CT": ["",""], "GT": ["",""], "AT": ["",""], "TT": ["",""]}
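The scoring described for check_imperfect is a weighted edit distance in which an indel costs 2 and a mismatch costs 1. Under that scheme, a hit within distance 2 is either up to two mismatches or a single indel. The pure-Python function below is an illustrative sketch of that scoring, not the package's own code; weighted_distance and the example sequences are hypothetical.

```python
def weighted_distance(a, b, indel_cost=2, mismatch_cost=1):
    """Edit distance between a and b with separate indel and mismatch costs."""
    # prev[j] holds the cost of converting the processed prefix of a into b[:j].
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, char_a in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, char_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,      # delete char_a
                curr[j - 1] + indel_cost,  # insert char_b
                prev[j - 1] + (0 if char_a == char_b else mismatch_cost),
            ))
        prev = curr
    return prev[-1]


kmer = 11
target_seq = "ACGTACGTACGTAC"  # made-up 14-mer for illustration
query = target_seq[:kmer]      # only the first `kmer` bases are compared

print(weighted_distance(query, "ACGTACGTACG"))  # identical window
print(weighted_distance(query, "ACGTTCGTACG"))  # one mismatch
print(weighted_distance(query, "ACGTCGTACG"))   # one deleted base
```

With these costs the three calls give 0, 1 and 2 respectively, so all three windows would fall within the distance <= 2 cutoff.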

Example Python script using some optional parameters:

from bridge_evaluator import design_bridges

#Specify the locus that will be scanned to design bRNAs
target_locus = "ATGAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCCGTGGAGAGGGTGAAGGTGATGCTACAAACGGAAAACTCACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCGTGGCCAACACTTGTCACTACTCTGACCTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCACATGAAACGGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAAGATGACGGGACCTACAAGACGCGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAGGGTATTGATTTTAAAGAAGATGGAAACATTCTTGGACACAAACTCGAGTACAACTTTAACTCACACAATGTATACATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAACGTTGAAGATGGTTCCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCGACACAATCTGTCCTTTCGAAAGATCCCAACGAAAAGCGTGACCACATGGTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTACACATGGCATGGATGAGCTCTACAAAtaa"

#Name of this locus. The results will be written to a file with this name
target_name = "sfGFP"

#GenBank file of the full genome of the target. This will be used to evaluate possible off-targets
genbank_file = "MG1655.gb"

#Donor A is the native Donor of IS621
donor_seq = 'ACAGTATCTTGTAT' #donor A

#Cores to use
cores = ["CT","GT"]

#avoid the following sequences - manually include the reverse complement
avoid_restriction = ["GGTCTC","GAGACC","GCTCTTC","GAAGAGC"]

design_bridges(target_locus, target_name, genbank_file, donor_seq = donor_seq, cores = cores, avoid_restriction = avoid_restriction)
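Because avoid_restriction needs reverse complements listed explicitly, it can be convenient to generate them rather than type them by hand. The helper below is a hypothetical convenience function, not part of the package; it expands a list of sites with any reverse complements that are missing.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string made of A, C, G and T."""
    return seq.translate(_COMPLEMENT)[::-1]


def with_reverse_complements(sites):
    """Return the sites plus their reverse complements, without duplicates.

    Order is preserved and palindromic sites are listed only once.
    """
    expanded = []
    for site in sites:
        site = site.upper()
        for candidate in (site, reverse_complement(site)):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


avoid_restriction = with_reverse_complements(["GGTCTC", "GCTCTTC"])
print(avoid_restriction)
# ['GGTCTC', 'GAGACC', 'GCTCTTC', 'GAAGAGC']
```

The result matches the hand-written list in the example above and can be passed as avoid_restriction.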

Output

The results are written to the current working directory as a CSV file named {target_name}.csv. Each row represents a potential bridgeRNA that can target the provided locus.

Output Columns:

target_gene: Name of locus being targeted

feature_type: Feature type of locus being targeted

donor_seq: Donor Sequence (14mer)

target_seq: Target Sequence (14mer)

index: Position of the Target Sequence relative to target locus

strand: Orientation of the Target Sequence (+ or -) relative to the target locus

core: Core sequence being used

perfect_match_targets: Number of perfect matches of the Target Sequence in the provided recipient genome. Note that only the first bases of the Target Sequence, as set by the kmer parameter (default = 11), are used for matching.

levenshtein_distance_1_targets: Number of matches of the Target Sequence in the provided recipient genome with a Levenshtein distance of 1. Note that only the first bases of the Target Sequence, as set by the kmer parameter (default = 11), are used for matching. Indels are scored as a distance of 2 and mismatches as a distance of 1.

levenshtein_distance_2_targets: Number of matches of the Target Sequence in the provided recipient genome with a Levenshtein distance of 2. Note that only the first bases of the Target Sequence, as set by the kmer parameter (default = 11), are used for matching. Indels are scored as a distance of 2 and mismatches as a distance of 1.

bridge_sequence: Full sequence of the bridgeRNA

p6p7_warning: Warning if this bridgeRNA violates preferred handshake rules.

RNA_structural_similarity: The predicted MFE secondary structure of the designed bridgeRNA is compared to the reference native IS621 secondary structure using Vienna RNA's RNAforester forest-based structural aligner. A similarity score from 0 to 1 is provided. This can be used to assess whether this bridgeRNA is likely to misfold.

eblock_seq: The bridgeRNA converted into a DNA fragment for synthesis, which can be cloned into the BsaI sites of the Patel et al. IS110 vectors.
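Once the CSV has been written, the columns above can be used to shortlist designs. The sketch below uses only the standard library. The functions load_designs and shortlist are hypothetical, the thresholds are arbitrary placeholders rather than recommendations, and it assumes the metric columns hold plain numbers.

```python
import csv


def load_designs(path):
    """Read the output CSV into a list of dicts keyed by column name."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def shortlist(rows, max_perfect=1, max_near=0, min_similarity=0.8):
    """Keep designs with few genomic matches and a native-like predicted fold.

    Thresholds are illustrative only. Rows with missing or non-numeric
    metrics are skipped. Results are sorted by structural similarity,
    highest first.
    """
    kept = []
    for row in rows:
        try:
            perfect = float(row["perfect_match_targets"])
            near = (float(row["levenshtein_distance_1_targets"])
                    + float(row["levenshtein_distance_2_targets"]))
            similarity = float(row["RNA_structural_similarity"])
        except (KeyError, TypeError, ValueError):
            continue
        if perfect <= max_perfect and near <= max_near and similarity >= min_similarity:
            kept.append(row)
    return sorted(kept, key=lambda r: float(r["RNA_structural_similarity"]), reverse=True)


# Example usage, assuming design_bridges() has already produced sfGFP.csv:
# for row in shortlist(load_designs("sfGFP.csv")):
#     print(row["target_seq"], row["strand"], row["RNA_structural_similarity"])
```

If check_imperfect or score_structure was disabled for the run, the corresponding columns may be empty, and the filter would need adjusting.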

References

Bridge RNAs direct programmable recombination of target and donor DNA
Arc Institute Bridge RNA Design Tool
