Python library to design sgRNA oligos

sgrna_designer

Python library to design sgRNAs for CRISPR tiling screens

The primary function of this package is design_sgrna_tiling_library. It takes a list of Ensembl transcript IDs and a region of interest (e.g. three_prime_UTR) and returns all sgRNAs tiling that region of each transcript.

Install

pip install git+https://github.com/gpp-rnd/sgrna_designer.git#egg=sgrna_designer

An example

In this example we'll design sgRNAs tiling the 3' UTRs of PDL1 (CD274) and BRAF.

Note: You must also have pandas installed to run this tutorial.

from sgrna_designer.design import design_sgrna_tiling_library

target_transcripts = ['ENST00000381577', 'ENST00000644969'] # [PDL1, BRAF]

Note that the design function is agnostic to CRISPR enzyme and PAM preferences, so you must specify the following parameters in a design run:

  • region_parent: broad region you are trying to target (e.g. UTR)
  • region: more specific region you are trying to target (e.g. three_prime_UTR)
  • expand_3prime: amount to expand region in 3' direction
  • expand_5prime: amount to expand region in 5' direction
  • context_len: length of context sequence
  • pam_start: position of PAM start relative to the context sequence
  • pam_len: length of PAM
  • sgrna_start: position of sgRNA relative to context sequence
  • sgrna_len: length of sgRNA sequence
  • pams: PAMs to target
  • sg_positions: positions within the sgRNA to annotate and target (e.g. [4,8] for nucleotides 4 and 8 of the sgRNA for a base editing window)
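
To make the coordinate parameters concrete, here is a minimal sketch that is not part of sgrna_designer. It assumes that sgrna_start is a 0-based offset from the start of the context sequence and that a negative pam_start counts back from its end. That reading is an assumption, but it reproduces the sgrna_sequence and pam_sequence reported for the first context_sequence in this tutorial's output. The helper slice_context is hypothetical and exists only for illustration.

```python
# First context_sequence from the tutorial output (PDL1, row 0).
context = "CATTGGAACTTCTGATCTTCAAGCAGGGAT"

# Same values as passed to design_sgrna_tiling_library in this tutorial.
context_len, pam_start, pam_len = 30, -6, 3
sgrna_start, sgrna_len = 4, 20


def slice_context(sequence, start, length):
    """Return `length` characters of `sequence` beginning at `start`.

    Assumption: a negative `start` counts back from the end of the sequence.
    """
    begin = start % len(sequence)
    return sequence[begin:begin + length]


sgrna = slice_context(context, sgrna_start, sgrna_len)
pam = slice_context(context, pam_start, pam_len)

print(len(context) == context_len, sgrna, pam)
```

Under these assumptions, the 20-nt sgRNA occupies context positions 4 through 23 and the 3-nt PAM follows it directly, leaving 4 nt of flanking sequence upstream and 3 nt downstream.
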
sgrna_designs = design_sgrna_tiling_library(target_transcripts, region_parent='UTR',
                                            region='three_prime_UTR', expand_3prime=30,
                                            expand_5prime=30, context_len=30, pam_start=-6,
                                            pam_len=3, sgrna_start=4, sgrna_len=20,
                                            pams=['AGG', 'CGG', 'TGG', 'GGG'],
                                            sg_positions=[4, 8], flag_seqs=['TTTT', 'CGTCTC', 'GAGACG'],
                                            flag_seqs_start=['TCTC', 'AGACG'], flag_seqs_end=['GAGAC'])
sgrna_designs
     context_sequence                pam_sequence  sgrna_sequence        sgrna_global_start  sgrna_global_4  sgrna_global_8  sgrna_strand  object_type      transcript_strand  transcript_id    chromosome  region_id        region_start  region_end
0    CATTGGAACTTCTGATCTTCAAGCAGGGAT  AGG           GGAACTTCTGATCTTCAAGC  5467872             5467875         5467879         1             three_prime_UTR  1                  ENST00000381577  9           ENST00000381577  5467863       5470554
1    ATTGGAACTTCTGATCTTCAAGCAGGGATT  GGG           GAACTTCTGATCTTCAAGCA  5467873             5467876         5467880         1             three_prime_UTR  1                  ENST00000381577  9           ENST00000381577  5467863       5470554
2    CTTCAAGCAGGGATTCTCAACCTGTGGTTT  TGG           AAGCAGGGATTCTCAACCTG  5467888             5467891         5467895         1             three_prime_UTR  1                  ENST00000381577  9           ENST00000381577  5467863       5470554
3    GCAGGGATTCTCAACCTGTGGTTTAGGGGT  AGG           GGATTCTCAACCTGTGGTTT  5467894             5467897         5467901         1             three_prime_UTR  1                  ENST00000381577  9           ENST00000381577  5467863       5470554
4    CAGGGATTCTCAACCTGTGGTTTAGGGGTT  GGG           GATTCTCAACCTGTGGTTTA  5467895             5467898         5467902         1             three_prime_UTR  1                  ENST00000381577  9           ENST00000381577  5467863       5470554
...  ...                             ...           ...                   ...                 ...             ...             ...           ...              ...                ...              ...         ...              ...           ...
845  GCTCAGGTCCCTTCATTTGTACTTTGGAGT  TGG           AGGTCCCTTCATTTGTACTT  140719570           140719567       140719563       -1            three_prime_UTR  -1                 ENST00000644969  7           ENST00000644969  140719337     140726493
846  TATAACAGAAAATATTGTTCAGTTTGGATA  TGG           ACAGAAAATATTGTTCAGTT  140719522           140719519       140719515       -1            three_prime_UTR  -1                 ENST00000644969  7           ENST00000644969  140719337     140726493
847  ATTGTTCAGTTTGGATAGAAAGCATGGAGA  TGG           TTCAGTTTGGATAGAAAGCA  140719509           140719506       140719502       -1            three_prime_UTR  -1                 ENST00000644969  7           ENST00000644969  140719337     140726493
848  TATTTAAAAACTGTATTATATAAAAGGCAA  AGG           TAAAAACTGTATTATATAAA  140719426           140719423       140719419       -1            three_prime_UTR  -1                 ENST00000644969  7           ENST00000644969  140719337     140726493
849  CTGCTATAATAAAGATTGACTGCATGGAGA  TGG           TATAATAAAGATTGACTGCA  140719360           140719357       140719353       -1            three_prime_UTR  -1                 ENST00000644969  7           ENST00000644969  140719337     140726493

850 rows × 14 columns
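
The sgrna_global_4 and sgrna_global_8 columns come from sg_positions=[4, 8]. As a rough sketch of how they relate to sgrna_global_start, the snippet below assumes that sg_positions are 1-based positions within the sgRNA and that coordinates increase along the plus strand and decrease along the minus strand. This is inferred from the output rather than taken from the library's source, and global_position is a hypothetical helper. The values are copied from rows 0 and 845 of the output.

```python
# (sgrna_global_start, sgrna_strand, sgrna_global_4, sgrna_global_8)
# copied from rows 0 (PDL1, + strand) and 845 (BRAF, - strand) of the output.
rows = [
    (5467872, 1, 5467875, 5467879),
    (140719570, -1, 140719567, 140719563),
]


def global_position(global_start, strand, sg_position):
    """Genomic coordinate of the 1-based `sg_position` within an sgRNA.

    Assumption: the sgRNA's first nucleotide sits at `global_start`, and
    later nucleotides move in the direction given by `strand` (1 or -1).
    """
    return global_start + strand * (sg_position - 1)


for start, strand, expected_4, expected_8 in rows:
    assert global_position(start, strand, 4) == expected_4
    assert global_position(start, strand, 8) == expected_8
```

Read this way, the annotated positions can be compared directly against genomic features such as variant coordinates, regardless of which strand the sgRNA targets.
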
