Skip to main content

Tools to design and analyse CRISPRi experiments

Project description

CRISPRbact

Tools to design and analyse CRISPRi experiments in bacteria.

CRISPRbact currently contains an on-target activity prediction tool for the Streptococcus pyogenes dCas9 protein. This tool takes as an input the sequence of a gene of interest and returns a list of possible target sequences with the predicted on-target activity. Predictions are made using a linear model fitted on data from a genome-wide CRISPRi screen performed in E. coli (Cui et al. Nature Communications, 2018). The model predicts the ability of dCas9 to block the RNA polymerase when targeting the non-template strand (i.e. the coding strand) of a target gene.

Getting Started

Installation

For the moment, you can install this package only via PyPI

PyPI

$ pip install crisprbact
$ crisprbact --help
Usage: crisprbact [OPTIONS] COMMAND [ARGS]...

Options:
  -v, --verbose
  --help         Show this message and exit.

Commands:
  predict

API

Using this library in your python code.

from crisprbact import on_target_predict

guide_rnas = on_target_predict("ACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTGTTTGTGATTGTTTGCCGCGCGCGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAATATTGATGAAG")

for guide_rna in guide_rnas:
    print(guide_rna)

output :

{'target': 'TCATCACGGGCCTTCGCCACGCGCG', 'guide': 'TCATCACGGGCCTTCGCCAC', 'start': 82, 'stop': 102, 'pam': 80, 'ori': '-', 'target_id': 1, 'pred': -0.4719254873780802, 'off_targets_per_seed': []}
{'target': 'CATCACGGGCCTTCGCCACGCGCGC', 'guide': 'CATCACGGGCCTTCGCCACG', 'start': 81, 'stop': 101, 'pam': 79, 'ori': '-', 'target_id': 2, 'pred': 1.0491308060379676, 'off_targets_per_seed': []}
{'target': 'CGCGCGCGGCAAACAATCACAAACA', 'guide': 'CGCGCGCGGCAAACAATCAC', 'start': 63, 'stop': 83, 'pam': 61, 'ori': '-', 'target_id': 3, 'pred': -0.9021152826078697, 'off_targets_per_seed': []}
{'target': 'CCTGATCGGTATTGAACAGCATCTG', 'guide': 'CCTGATCGGTATTGAACAGC', 'start': 29, 'stop': 49, 'pam': 27, 'ori': '-', 'target_id': 4, 'pred': 0.23853258873311955, 'off_targets_per_seed': []}

Command line interface

Predict guide RNAs activity

Input the sequence of a target gene and this script will output candidate guide RNAs for the S. pyogenes dCas9 with predicted on-target activity.

$ crisprbact predict --help
Usage: crisprbact predict [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  from-seq  Outputs candidate guide RNAs for the S.
  from-str  Outputs candidate guide RNAs for the S.
From a string sequence:

The target input sequence can be a simple string.

$ crisprbact predict from-str --help
Usage: cli.py predict from-str [OPTIONS] [OUTPUT_FILE]

  Outputs candidate guide RNAs for the S. pyogenes dCas9 with predicted on-
  target activity from a target gene.

  [OUTPUT_FILE] file where the candidate guide RNAs are saved. Default =
  "stdout"

Options:
  -t, --target TEXT               Sequence file to target  [required]
  -s, --off-target-sequence FILENAME
                                  Sequence in which you want to find off-
                                  targets
  -w, --off-target-sequence-format [fasta|gb|genbank]
                                  Sequence in which you want to find off-
                                  targets format  [default: genbank]
  --help                          Show this message and exit.

$ crisprbact predict from-str -t ACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTGTTTGTGATTGTTTGCCGCGCGCGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAATATTGATGAAG guide-rnas.tsv

output file guide-rnas.tsv :

No seq_id is defined since it is from a simple string.

target	PAM position	prediction	seq_id
TCATCACGGGCCTTCGCCACGCGCG	80	-0.4719254873780802	N/A
CATCACGGGCCTTCGCCACGCGCGC	79	1.0491308060379676	N/A
CGCGCGCGGCAAACAATCACAAACA	61	-0.9021152826078697	N/A
CCTGATCGGTATTGAACAGCATCTG	27	0.23853258873311955	N/A

You can also pipe the results :

$ crisprbact predict from-str -t ACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTGTTTGTGATTGTTTGCCGCGCGCGTGGCGAAGGCCCGTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAATATTGATGAAG | tail -n +2 | wc -l
From a sequence file
$ crisprbact predict from-seq --help
Usage: cli.py predict from-seq [OPTIONS] [OUTPUT_FILE]

  Outputs candidate guide RNAs for the S. pyogenes dCas9 with predicted on-
  target activity from a target gene.

  [OUTPUT_FILE] file where the candidate guide RNAs are saved. Default =
  "stdout"

Options:
  -t, --target FILENAME           Sequence file to target  [required]
  -f, --seq-format [fasta|gb|genbank]
                                  Sequence file to target format  [default:
                                  fasta]
  -s, --off-target-sequence FILENAME
                                  Sequence in which you want to find off-
                                  targets
  -w, --off-target-sequence-format [fasta|gb|genbank]
                                  Sequence in which you want to find off-
                                  targets format  [default: genbank]
  --help                          Show this message and exit.
  • Fasta file (could be a multifasta file)
$ crisprbact predict from-seq -t /tmp/seq.fasta guide-rnas.tsv
  • GenBank file
$ crisprbact predict from-seq -t /tmp/seq.gb -f gb guide-rnas.tsv
  • Off-targets
predict from-seq -t data-test/sequence.fasta -s data-test/sequence.gb guide-rnas.tsv
Output file
target_id	target	PAM position	prediction	target_seq_id	seed_size	off_target_recid	off_target_start	off_target_end	off_target_pampos	off_target_strand	off_target_feat_type	off_target_feat_start	off_target_feat_end	off_target_feat_strand	off_target_locus_tag	off_target_gene	off_target_note	off_target_product	off_target_protein_id
1	TGATCCAGGCATTTTTTAGCTTCAT	835	0.47949500169043713	NC_017634.1:2547433-2548329	8	NC_017634.1	1388198	1388209	1388209	+
1	TGATCCAGGCATTTTTTAGCTTCAT	835	0.47949500169043713	NC_017634.1:2547433-2548329	8	NC_017634.1	2244514	2244525	2244525	+	CDS	2243562	2244720	-1	NRG857_10810		COG1174 ABC-type proline/glycine betaine transport systems, permease component	putative transport system permease	YP_006120510.1
1	TGATCCAGGCATTTTTTAGCTTCAT	835	0.47949500169043713	NC_017634.1:2547433-2548329	8	NC_017634.1	4160984	4160995	4160995	+	CDS	4160074	4161406	-1	NRG857_19625	hslU	COG1220 ATP-dependent protease HslVU (ClpYQ), ATPase subunit	ATP-dependent protease ATP-binding subunit HslU	YP_006122267.1
1	TGATCCAGGCATTTTTTAGCTTCAT	835	0.47949500169043713	NC_017634.1:2547433-2548329	8	NC_017634.1	4534189	4534200	4534200	+
1	TGATCCAGGCATTTTTTAGCTTCAT	835	0.47949500169043713	NC_017634.1:2547433-2548329	8	NC_017634.1	548804	548815	548804	-
1	TGATCCAGGCATTTTTTAGCTTCAT	835	0.47949500169043713	NC_017634.1:2547433-2548329	8	NC_017634.1	786462	786473	786462	-	CDS	785384	786470	1	NRG857_03580		COG2055 Malate/L-lactate dehydrogenases	hypothetical protein	YP_006119079.1

Contributing

Clone repo

$ git clone https://gitlab.pasteur.fr/dbikard/crisprbact.git

Create a virtualenv

$ virtualenv -p python3.7 .venv
$ . .venv/bin/activate
$ pip install poetry

Install crisprbact dependencies

$ poetry install

Install hooks

In order to run flake8 and black for each commit.

$ pre-commit install

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

crisprbact-0.3.7.tar.gz (24.8 kB view hashes)

Uploaded Source

Built Distribution

crisprbact-0.3.7-py3-none-any.whl (22.8 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page