Tools to design and analyse CRISPRi experiments
Project description
CRISPRbact
Tools to design and analyse CRISPRi experiments in bacteria.
CRISPRbact currently contains an on-target activity prediction tool for the Streptococcus pyogenes dCas9 protein. This tool takes as an input the sequence of a gene of interest and returns a list of possible target sequences with the predicted on-target activity. Predictions are made using a linear model fitted on data from a genome-wide CRISPRi screen performed in E. coli (Cui et al. Nature Communications, 2018). The model predicts the ability of dCas9 to block the RNA polymerase when targeting the non-template strand (i.e. the coding strand) of a target gene.
Getting Started
Installation
For the moment, you can install this package only via PyPI
PyPI
$ pip install crisprbact
$ crisprbact --help
Usage: crisprbact [OPTIONS] COMMAND [ARGS]...
Options:
-v, --verbose
--help Show this message and exit.
Commands:
predict
API
Using this library in your python code.
from crisprbact import on_target_predict
guide_rnas = on_target_predict("ACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTGTTTGTGATTGTTTGCCGCGCGCGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAATATTGATGAAG")
for guide_rna in guide_rnas:
print(guide_rna)
output :
{'target': 'TCATCACGGGCCTTCGCCACGCGCG', 'guide': 'TCATCACGGGCCTTCGCCAC', 'start': 82, 'stop': 102, 'pam': 80, 'ori': '-', 'target_id': 1, 'pred': -0.4719254873780802, 'off_targets_per_seed': []}
{'target': 'CATCACGGGCCTTCGCCACGCGCGC', 'guide': 'CATCACGGGCCTTCGCCACG', 'start': 81, 'stop': 101, 'pam': 79, 'ori': '-', 'target_id': 2, 'pred': 1.0491308060379676, 'off_targets_per_seed': []}
{'target': 'CGCGCGCGGCAAACAATCACAAACA', 'guide': 'CGCGCGCGGCAAACAATCAC', 'start': 63, 'stop': 83, 'pam': 61, 'ori': '-', 'target_id': 3, 'pred': -0.9021152826078697, 'off_targets_per_seed': []}
{'target': 'CCTGATCGGTATTGAACAGCATCTG', 'guide': 'CCTGATCGGTATTGAACAGC', 'start': 29, 'stop': 49, 'pam': 27, 'ori': '-', 'target_id': 4, 'pred': 0.23853258873311955, 'off_targets_per_seed': []}
Command line interface
Predict guide RNAs activity
Input the sequence of a target gene and this script will output candidate guide RNAs for the S. pyogenes dCas9 with predicted on-target activity.
$ crisprbact predict --help
Usage: crisprbact predict [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
from-seq Outputs candidate guide RNAs for the S.
from-str Outputs candidate guide RNAs for the S.
From a string sequence:
The target input sequence can be a simple string.
$ crisprbact predict from-str --help
Usage: cli.py predict from-str [OPTIONS] [OUTPUT_FILE]
Outputs candidate guide RNAs for the S. pyogenes dCas9 with predicted on-
target activity from a target gene.
[OUTPUT_FILE] file where the candidate guide RNAs are saved. Default =
"stdout"
Options:
-t, --target TEXT Sequence file to target [required]
-s, --off-target-sequence FILENAME
Sequence in which you want to find off-
targets
-w, --off-target-sequence-format [fasta|gb|genbank]
Sequence in which you want to find off-
targets format [default: genbank]
--help Show this message and exit.
$ crisprbact predict from-str -t ACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTGTTTGTGATTGTTTGCCGCGCGCGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAATATTGATGAAG guide-rnas.tsv
output file guide-rnas.tsv
:
No seq_id
is defined since it is from a simple string.
target PAM position prediction seq_id
TCATCACGGGCCTTCGCCACGCGCG 80 -0.4719254873780802 N/A
CATCACGGGCCTTCGCCACGCGCGC 79 1.0491308060379676 N/A
CGCGCGCGGCAAACAATCACAAACA 61 -0.9021152826078697 N/A
CCTGATCGGTATTGAACAGCATCTG 27 0.23853258873311955 N/A
You can also pipe the results :
$ crisprbact predict from-str -t ACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTGTTTGTGATTGTTTGCCGCGCGCGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAATATTGATGAAG | tail -n +2 | wc -l
From a sequence file
$ crisprbact predict from-seq --help
Usage: cli.py predict from-seq [OPTIONS] [OUTPUT_FILE]
Outputs candidate guide RNAs for the S. pyogenes dCas9 with predicted on-
target activity from a target gene.
[OUTPUT_FILE] file where the candidate guide RNAs are saved. Default =
"stdout"
Options:
-t, --target FILENAME Sequence file to target [required]
-f, --seq-format [fasta|gb|genbank]
Sequence file to target format [default:
fasta]
-s, --off-target-sequence FILENAME
Sequence in which you want to find off-
targets
-w, --off-target-sequence-format [fasta|gb|genbank]
Sequence in which you want to find off-
targets format [default: genbank]
--help Show this message and exit.
- Fasta file (could be a multifasta file)
$ crisprbact predict from-seq -t /tmp/seq.fasta guide-rnas.tsv
- GenBank file
$ crisprbact predict from-seq -t /tmp/seq.gb -f gb guide-rnas.tsv
- Off-targets
predict from-seq -t data-test/sequence.fasta -s data-test/sequence.gb guide-rnas.tsv
Output file
target_id target PAM position prediction target_seq_id seed_size off_target_recid off_target_start off_target_end off_target_pampos off_target_strand off_target_feat_type off_target_feat_start off_target_feat_end off_target_feat_strand off_target_locus_tag off_target_gene off_target_note off_target_product off_target_protein_id
1 TGATCCAGGCATTTTTTAGCTTCAT 835 0.47949500169043713 NC_017634.1:2547433-2548329 8 NC_017634.1 1388198 1388209 1388209 +
1 TGATCCAGGCATTTTTTAGCTTCAT 835 0.47949500169043713 NC_017634.1:2547433-2548329 8 NC_017634.1 2244514 2244525 2244525 + CDS 2243562 2244720 -1 NRG857_10810 COG1174 ABC-type proline/glycine betaine transport systems, permease component putative transport system permease YP_006120510.1
1 TGATCCAGGCATTTTTTAGCTTCAT 835 0.47949500169043713 NC_017634.1:2547433-2548329 8 NC_017634.1 4160984 4160995 4160995 + CDS 4160074 4161406 -1 NRG857_19625 hslU COG1220 ATP-dependent protease HslVU (ClpYQ), ATPase subunit ATP-dependent protease ATP-binding subunit HslU YP_006122267.1
1 TGATCCAGGCATTTTTTAGCTTCAT 835 0.47949500169043713 NC_017634.1:2547433-2548329 8 NC_017634.1 4534189 4534200 4534200 +
1 TGATCCAGGCATTTTTTAGCTTCAT 835 0.47949500169043713 NC_017634.1:2547433-2548329 8 NC_017634.1 548804 548815 548804 -
1 TGATCCAGGCATTTTTTAGCTTCAT 835 0.47949500169043713 NC_017634.1:2547433-2548329 8 NC_017634.1 786462 786473 786462 - CDS 785384 786470 1 NRG857_03580 COG2055 Malate/L-lactate dehydrogenases hypothetical protein YP_006119079.1
Contributing
Clone repo
$ git clone https://gitlab.pasteur.fr/dbikard/crisprbact.git
Create a virtualenv
$ virtualenv -p python3.7 .venv
$ . .venv/bin/activate
$ pip install poetry
Install crisprbact dependencies
$ poetry install
Install hooks
In order to run flake8 and black for each commit.
$ pre-commit install
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for crisprbact-0.3.11-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | f49989694b71275aa7ec82d51b7ecbf6b1f3a4e197a8dafdf1c7ee40dffc1705 |
|
MD5 | 1afb6ac95fa751950d548c9759695c62 |
|
BLAKE2b-256 | 04788500a81a74770803f615f6252b450bb544841e682a555550ce35db35c572 |