BridgeEvaluator

This package designs bridgeRNAs (bRNAs) targeting a given locus and provides metrics to evaluate their efficiency and specificity.

Author: Jaymin Patel: jayman1466@gmail.com

Installation

Streamlined Installation

You can install this package via pip:

pip install BridgeEvaluator

To score the predicted folding of the designed bRNAs, you must also install the ViennaRNA package. If it is not installed, set score_structure=False when calling design_bridges().
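One way to pick the score_structure value automatically is to check for ViennaRNA before calling design_bridges(). The sketch below assumes that structure scoring needs both the RNA Python module (installed by the viennarna package) and the RNAforester executable on the PATH. The helper name vienna_available is hypothetical, and the exact requirements of BridgeEvaluator may differ.

```python
import importlib.util
import shutil


def vienna_available():
    """Return True if ViennaRNA's Python bindings and RNAforester both appear to be installed."""
    # Assumption: the Python bindings are importable as the top-level module "RNA".
    has_bindings = importlib.util.find_spec("RNA") is not None
    # Assumption: the structural aligner is reachable as "RNAforester" on the PATH.
    has_forester = shutil.which("RNAforester") is not None
    return has_bindings and has_forester


score_structure = vienna_available()
print("score_structure =", score_structure)
```

The resulting flag can then be passed along as design_bridges(target_locus, target_name, genbank_file, score_structure=score_structure).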

If pip reports that the package cannot be found, updating Python and pip usually fixes it. Consider doing this in a fresh conda environment.

Manual Installation

Alternatively, you can install manually by placing the files from the "src/BridgeEvaluator/" directory directly into your working directory. In that case, make sure the following dependencies are installed under Python >= 3.9:

biopython >= 1.85

Levenshtein >= 0.27.1

viennarna >= 2.7.0

pandas >= 2.2.0
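For a manual installation, it can help to see which of these dependencies are already present. The following is a minimal sketch using only the standard library. It reports installed versions next to the minimums listed above but does not compare them, and the helper name installed_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names and minimum versions copied from the dependency list above.
REQUIRED = {
    "biopython": "1.85",
    "Levenshtein": "0.27.1",
    "viennarna": "2.7.0",
    "pandas": "2.2.0",
}


def installed_versions(requirements=REQUIRED):
    """Map each distribution name to its installed version, or None if it is missing."""
    found = {}
    for name in requirements:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, installed in installed_versions().items():
    print(f"{name}: installed={installed}, required>={REQUIRED[name]}")
```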

Usage

The simplest usage within a Python script is as follows:

from bridge_evaluator import design_bridges

#Specify the locus that will be scanned to design bRNAs
target_locus = "ATGAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCCGTGGAGAGGGTGAAGGTGATGCTACAAACGGAAAACTCACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCGTGGCCAACACTTGTCACTACTCTGACCTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCACATGAAACGGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAAGATGACGGGACCTACAAGACGCGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAGGGTATTGATTTTAAAGAAGATGGAAACATTCTTGGACACAAACTCGAGTACAACTTTAACTCACACAATGTATACATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAACGTTGAAGATGGTTCCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCGACACAATCTGTCCTTTCGAAAGATCCCAACGAAAAGCGTGACCACATGGTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTACACATGGCATGGATGAGCTCTACAAAtaa"

#Name of this locus. The results will be written to a file with this name
target_name = "sfGFP"

#GenBank file of the full genome of the target. This will be used to evaluate possible off-targets
genbank_file = "MG1655.gb"

design_bridges(target_locus, target_name, genbank_file)

Required Parameters

target_locus: Nucleotide sequence of the locus you want to target. This script will scan this locus to identify all permissive 14mer target sites and provide attributes to score their predicted efficiency and specificity.

target_name: Name of the locus you are targeting. This will be used to name the designed bridgeRNAs and the output file of the script. bridgeRNA names are given the syntax: bridge_IS621_T_{target_name}_{index}D{donor_name}. The {target_name} and {donor_name} are pulled from the arguments of this function.

genbank_file: GenBank file of the recipient genome. This will be used to identify potential off-target sites for each designed bridgeRNA.
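The scanning and naming described above can be illustrated with a simplified sketch. It assumes that a site is any 14-mer whose 2 bp core sits at 0-based offset 7, which is where "CT" falls in the default donor ACAGTATCTTGTAT. The package may apply further rules when deciding which sites are permissive. The function names, and the convention that index is the window start within the scanned strand, are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an A/C/G/T sequence (input is upper-cased first)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan_sites(locus, cores=("CT",), site_len=14, core_offset=7):
    """Yield (index, strand, site) for every window whose core is in `cores`.

    Assumption: the core occupies site[core_offset:core_offset + 2].
    """
    locus = locus.upper()
    for strand, seq in (("+", locus), ("-", reverse_complement(locus))):
        for i in range(len(seq) - site_len + 1):
            site = seq[i:i + site_len]
            if site[core_offset:core_offset + 2] in cores:
                yield i, strand, site


def bridge_name(target_name, index, donor_name="1"):
    """Format a name as bridge_IS621_T_{target_name}_{index}D{donor_name}."""
    return f"bridge_IS621_T_{target_name}_{index}D{donor_name}"


# Scan the first few bases of the sfGFP example locus.
for index, strand, site in scan_sites("ATGAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCC"):
    print(bridge_name("sfGFP", index), strand, site)
```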

Optional Parameters

donor_seq: Donor sequence (14mer) to use. Default is the native IS621 donor "ACAGTATCTTGTAT"

donor_name: Name of the Donor. This will be used to name the designed bridgeRNAs. Default is "1"

cores: Core sequences that can be used, provided as a list. Note, the core of the provided donor sequence will be modified to match the core of the target sequence for each designed bridgeRNA. Default is ['CT']

kmer: Number of bp, counted from the start of the target sequence, used to identify perfect and imperfect off-targets in the recipient genome. Default is 11

avoid_restriction: Restriction sites (and other sequences) to avoid in the designed bridgeRNAs, provided as a list. Reverse complements must be provided manually. Default is []

check_imperfect: Whether to search the recipient genome for imperfect off-target sequences. This increases computation time dramatically. All imperfect off-targets with a distance <= 2 are tabulated, using a weighted Levenshtein distance in which each mismatch costs 1 and each indel costs 2. Default is True

score_structure: Whether to evaluate the predicted secondary structure of the designed bridgeRNAs. The predicted MFE structure of each designed bridgeRNA is compared to the reference native IS621 secondary structure using ViennaRNA's RNAforester forest-based structural aligner. A similarity score from 0 to 1 is provided. Default is True

feature_type: For internal metadata, you can include a feature type (e.g. CDS, ncRNA) for the locus you are targeting. Default is ""

primer_seqs: In addition to full bridgeRNA sequences, this script converts these sequences into DNA fragments for synthesis, which can be cloned into the BsaI sites in Patel et al. IS110 vectors. If you'd like to include PCR priming sites to amplify these fragments, you can specify them as a dictionary of lists, keyed by core sequence, as follows: {'CT': ["for_primer_seq1", "rev_primer_seq1"], 'GT': ["for_primer_seq2", "rev_primer_seq1"]}. Default is {"CT": ["",""], "GT": ["",""], "AT": ["",""], "TT": ["",""]}
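The scoring used by check_imperfect (mismatch = 1, indel = 2) differs from the plain Levenshtein distance, where every edit costs 1. The sketch below shows that weighting with a small pure-Python dynamic program. It illustrates the scoring rule only and is not necessarily how the package computes it. The function name weighted_distance is hypothetical.

```python
def weighted_distance(a, b, sub_cost=1, indel_cost=2):
    """Edit distance where a mismatch costs sub_cost and an insertion or deletion costs indel_cost."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,                          # delete ca
                curr[j - 1] + indel_cost,                      # insert cb
                prev[j - 1] + (0 if ca == cb else sub_cost),   # match or mismatch
            ))
        prev = curr
    return prev[-1]


kmer = 11
target = "ACAGTATCTTGTAT"
# Compare the first kmer bp of the target with a candidate genomic window.
print(weighted_distance(target[:kmer], "ACAGTTTCTTG"))  # one mismatch -> 1
```

Under this weighting, a candidate within a distance of 2 has either at most two mismatches or a single indel.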

Example Python script using some optional parameters:

from bridge_evaluator import design_bridges

#Specify the locus that will be scanned to design bRNAs
target_locus = "ATGAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCCGTGGAGAGGGTGAAGGTGATGCTACAAACGGAAAACTCACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCGTGGCCAACACTTGTCACTACTCTGACCTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCACATGAAACGGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAAGATGACGGGACCTACAAGACGCGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAGGGTATTGATTTTAAAGAAGATGGAAACATTCTTGGACACAAACTCGAGTACAACTTTAACTCACACAATGTATACATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAACGTTGAAGATGGTTCCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCGACACAATCTGTCCTTTCGAAAGATCCCAACGAAAAGCGTGACCACATGGTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTACACATGGCATGGATGAGCTCTACAAAtaa"

#Name of this locus. The results will be outputted with this filename
target_name = "sfGFP"

#GenBank file of the full genome of the target. This will be used to evaluate possible off-targets
genbank_file = "MG1655.gb"

#Donor A is the native Donor of IS621
donor_seq = 'ACAGTATCTTGTAT' #donor A

#Cores to use
cores = ["CT","GT"]

#avoid the following sequences - manually include the reverse complement
avoid_restriction = ["GGTCTC","GAGACC","GCTCTTC","GAAGAGC"]

design_bridges(target_locus, target_name, genbank_file, donor_seq = donor_seq, cores = cores, avoid_restriction = avoid_restriction)
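Because reverse complements must be listed manually in avoid_restriction, a small helper can generate them and reduce the chance of a typo. This is a minimal sketch and the helper name with_reverse_complements is hypothetical. It assumes the sites contain only A, C, G and T.

```python
def with_reverse_complements(sites):
    """Return the sites plus any missing reverse complements, preserving order."""
    complement = str.maketrans("ACGT", "TGCA")
    expanded = []
    for site in sites:
        site = site.upper()
        # Add the site itself, then its reverse complement, skipping duplicates
        # (palindromic sites equal their own reverse complement).
        for candidate in (site, site.translate(complement)[::-1]):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


avoid_restriction = with_reverse_complements(["GGTCTC", "GCTCTTC"])
print(avoid_restriction)  # ['GGTCTC', 'GAGACC', 'GCTCTTC', 'GAAGAGC']
```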

Output

The results are written to the current working directory as a CSV file named {target_name}.csv. Each row represents a potential bridgeRNA that can target the provided locus.

Output Columns:

target_gene: Name of locus being targeted

feature_type: Feature type of locus being targeted

donor_seq: Donor Sequence (14mer)

target_seq: Target Sequence (14mer)

index: Position of the Target Sequence relative to target locus

strand: Orientation of the Target Sequence (+ or -) relative to the target locus

core: Core sequence being used

perfect_match_targets: Number of perfect matches of the Target Sequence in the provided recipient genome. Note, only the first X bp specified by the kmer parameter (default = 11) are used for matching.

levenshtein_distance_1_targets: Number of matches of the Target Sequence in the provided recipient genome with a Levenshtein distance of 1. Note, only the first X bp specified by the kmer parameter (default = 11) are used for matching. Indels are scored as a distance of 2. Mismatches are scored as a distance of 1.

levenshtein_distance_2_targets: Number of matches of the Target Sequence in the provided recipient genome with a Levenshtein distance of 2. Note, only the first X bp specified by the kmer parameter (default = 11) are used for matching. Indels are scored as a distance of 2. Mismatches are scored as a distance of 1.

bridge_sequence: Full sequence of the bridgeRNA

p6p7_warning: Warning if this bridgeRNA violates preferred handshake rules.

RNA_structural_similarity: The predicted MFE secondary structure of the designed bridgeRNA is compared to the reference native IS621 secondary structure using ViennaRNA's RNAforester forest-based structural aligner. A similarity score from 0 to 1 is provided. This can be used to assess whether this bridgeRNA is likely to misfold.

eblock_seq: The bridgeRNA converted into a DNA fragment for synthesis, which can be cloned into the BsaI sites in Patel et al. IS110 vectors.
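The columns above can be used to shortlist candidates. The sketch below reads the CSV with the standard library and keeps rows with few perfect matches and a high structural similarity. The cutoffs are arbitrary assumptions: max_perfect=1 supposes that the intended site itself may be counted once when the locus is part of the recipient genome, and min_similarity=0.8 is only a placeholder. It also assumes score_structure=True so that RNA_structural_similarity is populated. The function names and the example rows are made up for illustration.

```python
import csv


def load_rows(path):
    """Read the output CSV into a list of dictionaries (one per bridgeRNA)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def rank_bridges(rows, max_perfect=1, min_similarity=0.8):
    """Keep rows passing both cutoffs, sorted by structural similarity (highest first)."""
    kept = [
        row for row in rows
        if int(row["perfect_match_targets"]) <= max_perfect
        and float(row["RNA_structural_similarity"]) >= min_similarity
    ]
    return sorted(kept, key=lambda row: float(row["RNA_structural_similarity"]), reverse=True)


# Dummy rows standing in for load_rows("sfGFP.csv"); the values are not real results.
example_rows = [
    {"index": "10", "perfect_match_targets": "0", "RNA_structural_similarity": "0.85"},
    {"index": "57", "perfect_match_targets": "3", "RNA_structural_similarity": "0.95"},
    {"index": "92", "perfect_match_targets": "1", "RNA_structural_similarity": "0.90"},
]
for row in rank_bridges(example_rows):
    print(row["index"], row["RNA_structural_similarity"])
```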

References

Bridge RNAs direct programmable recombination of target and donor DNA
Arc Institute Bridge RNA Design Tool
