BridgeEvaluator

This package designs bridgeRNAs (bRNAs) targeting a given locus and provides metrics to evaluate their efficiency and specificity.

Author: Jaymin Patel: jayman1466@gmail.com

Installation

Streamlined Installation

You can install this package via pip:

pip install BridgeEvaluator

To score the predicted folding of the designed bRNAs, you must also install the ViennaRNA package. If it is not installed, pass score_structure=False to design_bridges().
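If you are unsure whether ViennaRNA is present, a small pre-flight check can choose the value of score_structure for you. This is a minimal sketch, not part of BridgeEvaluator: vienna_available() is a hypothetical helper, and it assumes that structure scoring needs both the ViennaRNA Python bindings (importable as RNA) and the RNAforester executable on your PATH.

```python
import importlib.util
import shutil


def vienna_available() -> bool:
    """Return True if the ViennaRNA pieces assumed to be needed are present.

    Assumption: structure scoring uses the `RNA` Python module and the
    `RNAforester` command-line tool. Adjust if your setup differs.
    """
    has_python_bindings = importlib.util.find_spec("RNA") is not None
    has_rnaforester = shutil.which("RNAforester") is not None
    return has_python_bindings and has_rnaforester


# Fall back to score_structure=False when ViennaRNA is not found.
score_structure = vienna_available()
print(f"score_structure={score_structure}")
```

The resulting value can then be passed as design_bridges(..., score_structure=score_structure).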

Manual Installation

Alternatively, you can place the files from the "src/BridgeEvaluator/" directory directly into your working directory. If you install manually, make sure you are running Python >= 3.9 with the following dependencies installed:

biopython >= 1.85

Levenshtein >= 0.27.1

viennarna >= 2.7.0

pandas >= 2.2.0
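For a manual installation, the minimum versions above can be checked from Python before running anything. The sketch below is illustrative only: MINIMUMS simply copies the list above, the distribution names are assumed to match the names as written there, and the version comparison handles plain dotted release numbers only.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "biopython": "1.85",
    "Levenshtein": "0.27.1",
    "viennarna": "2.7.0",
    "pandas": "2.2.0",
}


def as_tuple(text):
    """Turn '2.7.0' into (2, 7, 0); stops at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted versions, padding the shorter one with zeros."""
    a, b = as_tuple(installed), as_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies(minimums=MINIMUMS):
    """Return a {name: status} report for each required distribution."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = meets_minimum(installed, minimum)
        report[name] = "ok" if ok else f"too old ({installed} < {minimum})"
    return report


for name, status in check_dependencies().items():
    print(f"{name}: {status}")
```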

Usage

The simplest usage within a python script is as follows:

from bridge_evaluator import design_bridges

#Specify the locus that will be scanned to design bRNAs
target_locus = "ATGAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCCGTGGAGAGGGTGAAGGTGATGCTACAAACGGAAAACTCACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCGTGGCCAACACTTGTCACTACTCTGACCTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCACATGAAACGGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAAGATGACGGGACCTACAAGACGCGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAGGGTATTGATTTTAAAGAAGATGGAAACATTCTTGGACACAAACTCGAGTACAACTTTAACTCACACAATGTATACATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAACGTTGAAGATGGTTCCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCGACACAATCTGTCCTTTCGAAAGATCCCAACGAAAAGCGTGACCACATGGTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTACACATGGCATGGATGAGCTCTACAAAtaa"

#Name of this locus. The results will be written to a file with this name
target_name = "sfGFP"

#GenBank file of the full genome of the target. This will be used to evaluate possible off-targets
genbank_file = "MG1655.gb"

design_bridges(target_locus, target_name, genbank_file)

Required Parameters

target_locus: Nucleotide sequence of the locus you want to target. This script will scan this locus to identify all permissive 14mer target sites and provide attributes to score their predicted efficiency and specificity.

target_name: Name of the locus you are targeting. This will be used to name the designed bridgeRNAs and the output file of the script. bridgeRNA names are given the syntax: bridge_IS621_T_{target_name}_{index}D{donor_name}. The {target_name} and {donor_name} are pulled from the arguments of this function.

genbank_file: GenBank file of the recipient genome. This will be used to identify potential off-target sites for each designed bridgeRNA.
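To make the scanning and naming described above concrete, here is a minimal sketch of how candidate 14mer sites could be enumerated and named. It is not the package's implementation: scan_sites() and bridge_name() are hypothetical helpers, the core is assumed to occupy positions 8-9 of the 14mer (as CT does in the native IS621 donor ACAGTATCTTGTAT), and the index is simply the 0-based offset within the scanned strand, which may differ from the convention BridgeEvaluator uses. Any further criteria the package applies to decide whether a site is permissive are not modeled.

```python
SITE_LENGTH = 14
CORE_START = 7  # 0-based; assumption: core sits at positions 8-9 of the 14mer


def reverse_complement(seq):
    """Reverse complement of an A/C/G/T sequence (case-insensitive)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan_sites(locus, cores=("CT",)):
    """List (index, strand, site) for every 14mer whose core is allowed.

    The index is the 0-based offset within the scanned strand, so '-' hits
    are indexed along the reverse complement of the locus.
    """
    hits = []
    for strand, seq in (("+", locus.upper()), ("-", reverse_complement(locus))):
        for i in range(len(seq) - SITE_LENGTH + 1):
            site = seq[i:i + SITE_LENGTH]
            if site[CORE_START:CORE_START + 2] in cores:
                hits.append((i, strand, site))
    return hits


def bridge_name(target_name, index, donor_name="1"):
    """Build a name following bridge_IS621_T_{target_name}_{index}D{donor_name}."""
    return f"bridge_IS621_T_{target_name}_{index}D{donor_name}"


for index, strand, site in scan_sites("ACAGTATCTTGTAT"):
    print(bridge_name("demo", index), strand, site)
```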

Optional Parameters

donor_seq: The donor sequence (14mer) being used. Default is the native IS621 donor "ACAGTATCTTGTAT"

donor_name: Name of the Donor. This will be used to name the designed bridgeRNAs. Default is "1"

cores: Core sequences that can be used, provided as a list. Note, the core of the provided donor sequence will be modified to match the core of the target sequence for each designed bridgeRNA. Default is ['CT']

kmer: Number of bp, counted from the start of the target sequence, that are used to identify perfect and imperfect off-targets in the recipient genome. Default is 11

avoid_restriction: Restriction sites (and other sequences) to avoid in the designed bridgeRNAs, provided as a list. Reverse complements must be provided manually. Default is []

check_imperfect: Whether to look for imperfect off-target sequences in the recipient genome. This increases computational time dramatically. All imperfect off-targets with a Levenshtein distance <= 2 are tabulated, where each mismatch counts as 1 and each indel counts as 2. Default is True

score_structure: Should this script evaluate the predicted secondary structure of the designed bridgeRNAs? Predicted MFE structures of designed bridgeRNAs are compared to the reference native IS621 secondary structure using Vienna RNA's RNAforester forest-based structural aligner. A similarity score from 0 to 1 is provided. Default is True

feature_type: For internal metadata, you can include a feature type (e.g. CDS, ncRNA) for the locus you are targeting. Default is ""

primer_seqs: In addition to full bridgeRNA sequences, this script converts these sequences into DNA fragments for synthesis, which can be cloned into the BsaI sites in Patel et al. IS110 vectors. If you'd like to include PCR priming sites to amplify these fragments, you can specify them as a dictionary of lists, keyed by core sequence, as follows: {'CT': ["for_primer_seq1", "rev_primer_seq1"], 'GT': ["for_primer_seq2", "rev_primer_seq2"]}. Default is {"CT": ["",""], "GT": ["",""], "AT": ["",""], "TT": ["",""]}
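The scoring used for imperfect off-targets (mismatch = 1, indel = 2, applied to the first kmer bp of the target) can be illustrated with a small weighted edit distance. This is a sketch of the scoring rule only, written in plain Python so it runs without dependencies; weighted_distance() is a hypothetical helper, and how BridgeEvaluator actually searches the genome for candidates is not shown here.

```python
def weighted_distance(a, b, sub_cost=1, indel_cost=2):
    """Edit distance where a substitution costs 1 and an indel costs 2."""
    # prev[j] holds the cost of turning the processed prefix of a into b[:j].
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, char_a in enumerate(a, start=1):
        cur = [i * indel_cost]
        for j, char_b in enumerate(b, start=1):
            cur.append(min(
                prev[j] + indel_cost,                             # delete from a
                cur[j - 1] + indel_cost,                          # insert into a
                prev[j - 1] + (0 if char_a == char_b else sub_cost),
            ))
        prev = cur
    return prev[-1]


target = "ACAGTATCTTGTAT"
kmer = 11
seed = target[:kmer]  # only the first kmer bp are used for matching

# Illustrative genome fragments: identical, one mismatch, one deletion.
for candidate in ["ACAGTATCTTG", "ACAGTTTCTTG", "ACAGTATCTG"]:
    print(candidate, weighted_distance(seed, candidate))
```

Under this weighting, a distance of 2 can arise from either two mismatches or a single indel, which is why both land in the same bucket.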

Example python script utilizing some optional parameters:

from bridge_evaluator import design_bridges

#Specify the locus that will be scanned to design bRNAs
target_locus = "ATGAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCCGTGGAGAGGGTGAAGGTGATGCTACAAACGGAAAACTCACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCGTGGCCAACACTTGTCACTACTCTGACCTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCACATGAAACGGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAAGATGACGGGACCTACAAGACGCGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAGGGTATTGATTTTAAAGAAGATGGAAACATTCTTGGACACAAACTCGAGTACAACTTTAACTCACACAATGTATACATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAACGTTGAAGATGGTTCCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCGACACAATCTGTCCTTTCGAAAGATCCCAACGAAAAGCGTGACCACATGGTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTACACATGGCATGGATGAGCTCTACAAAtaa"

#Name of this locus. The results will be written to a file with this name
target_name = "sfGFP"

#GenBank file of the full genome of the target. This will be used to evaluate possible off-targets
genbank_file = "MG1655.gb"

#Donor A is the native Donor of IS621
donor_seq = 'ACAGTATCTTGTAT' #donor A

#Cores to use
cores = ["CT","GT"]

#avoid the following sequences - manually include the reverse complement
avoid_restriction = ["GGTCTC","GAGACC","GCTCTTC","GAAGAGC"]

design_bridges(target_locus, target_name, genbank_file, donor_seq = donor_seq, cores = cores, avoid_restriction = avoid_restriction)
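Because reverse complements must be supplied by hand in avoid_restriction, it can be convenient to generate them. The helper below is a hypothetical convenience function, not part of BridgeEvaluator; it expands a list of sites with their reverse complements, keeping the original order and skipping duplicates such as palindromic sites.

```python
def reverse_complement(seq):
    """Reverse complement of an A/C/G/T sequence (case-insensitive)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def with_reverse_complements(sites):
    """Return each site followed by its reverse complement, without duplicates."""
    expanded = []
    for site in sites:
        for candidate in (site.upper(), reverse_complement(site)):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


# BsaI (GGTCTC) and the second site used in the example above.
avoid_restriction = with_reverse_complements(["GGTCTC", "GCTCTTC"])
print(avoid_restriction)
```

The resulting list reproduces the one written out by hand in the example above and can be passed directly as avoid_restriction.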

Output

The results are exported to the current working directory as a CSV file named {target_name}.csv. Each row represents a potential bridgeRNA that can target the provided locus.

Output Columns:

target_gene: Name of locus being targeted

feature_type: Feature type of locus being targeted

donor_seq: Donor Sequence (14mer)

target_seq: Target Sequence (14mer)

index: Position of the Target Sequence relative to target locus

strand: Orientation of the Target Sequence (+ or -) relative to the target locus

core: Core sequence being used

perfect_match_targets: Number of perfect matches of the Target Sequence present in the provided recipient genome. Note, only the first X bp specified by the kmer attribute (default = 11) is used for matching.

levenshtein_distance_1_targets: Number of matches of the Target Sequence present in the provided recipient genome with a Levenshtein distance of 1. Note, only the first X bp specified by the kmer attribute (default = 11) is used for matching. Indels are scored as a distance of 2. SNPs are scored as a distance of 1.

levenshtein_distance_2_targets: Number of matches of the Target Sequence present in the provided recipient genome with a Levenshtein distance of 2. Note, only the first X bp specified by the kmer attribute (default = 11) is used for matching. Indels are scored as a distance of 2. SNPs are scored as a distance of 1.

bridge_sequence: Full sequence of the bridgeRNA

p6p7_warning: Warning if this bridgeRNA violates preferred handshake rules.

RNA_structural_similarity: The predicted MFE secondary structure of the designed bridgeRNA is compared to the reference native IS621 secondary structure using Vienna RNA's RNAforester forest-based structural aligner. A similarity score from 0 to 1 is provided. This can be used to assess whether this bridgeRNA is likely to misfold.

eblock_seq: The bridgeRNA converted into a DNA fragment for synthesis, which can be cloned into the BsaI sites in Patel et al. IS110 vectors.

References

Bridge RNAs direct programmable recombination of target and donor DNA
Arc Institute Bridge RNA Design Tool
