CRISPR gRNA design for zebrafish with Ensembl-aware filtering — Python 3 frontend wrapping the CHOPCHOP engine.
ZebraCHOP
CRISPR guide-RNA design for zebrafish (Danio rerio), with Ensembl-aware post-processing and a clean web UI.
ZebraCHOP is a zebrafish-focused fork of CHOPCHOP. It picks high-scoring Cas9/TALEN/Cpf1/Nickase guides for any zebrafish gene, scores them with published efficiency models (Doench 2016, Xu 2015, …), maps off-targets with Bowtie, and post-processes the output via Ensembl REST to bias selection toward early exons, where a disruption is more likely to produce a loss-of-function allele.
What it does
  Gene name(s)
        │
        ▼
┌────────────────┐   ┌────────────────┐   ┌────────────────┐   ┌────────────────┐
│ Resolve via    │ → │ Extract        │ → │ Score guides   │ → │ Filter via     │
│ danRer11.gene  │   │ sequence with  │   │ + off-targets  │   │ Ensembl exon   │
│ _table         │   │ twoBitToFa     │   │ via Bowtie     │   │ structure      │
└────────────────┘   └────────────────┘   └────────────────┘   └────────────────┘
        │                                                              │
        └──────────────────────────► Per-gene .tsv (ranked) ◄──────────┘
Inputs: gene name(s) (or a genePred file). Outputs: one TSV per gene with ranked guides, off-target counts, GC%, self-complementarity, efficiency score, etc., plus an optional Ensembl-aware text summary.
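The exon-aware filtering step can be pictured as a re-sort that assigns each guide to the exon it cuts and prefers earlier exons. The sketch below is illustrative only and is not the logic in reformat.py: the `Guide` fields, the `exon_rank` helper, and the tie-break on efficiency are all assumptions, and in practice the exon coordinates would come from Ensembl REST rather than a hard-coded list.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Guide:
    sequence: str
    cut_pos: int       # genomic coordinate of the cut site (hypothetical field)
    efficiency: float  # higher is better (hypothetical field)


def exon_rank(cut_pos: int, exons: List[Tuple[int, int]]) -> Optional[int]:
    """Return the 1-based rank of the exon containing cut_pos, or None.

    `exons` is given in transcript order as inclusive (start, end) pairs.
    """
    for rank, (start, end) in enumerate(exons, start=1):
        if start <= cut_pos <= end:
            return rank
    return None


def prefer_early_exons(guides: List[Guide], exons: List[Tuple[int, int]]) -> List[Guide]:
    """Drop guides that cut outside exons, then sort by exon rank and efficiency."""
    ranked = [(exon_rank(g.cut_pos, exons), g) for g in guides]
    exonic = [(rank, g) for rank, g in ranked if rank is not None]
    # Earlier exon first; within an exon, the more efficient guide first.
    exonic.sort(key=lambda item: (item[0], -item[1].efficiency))
    return [g for _, g in exonic]
```

Sorting on the pair (exon rank, negative efficiency) keeps the early-exon preference strict; a weighted score would be an alternative if a very efficient guide in exon 2 should be allowed to outrank a weak one in exon 1.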
Quick start
git clone https://github.com/abachu2005/ZebraCHOP.git
cd ZebraCHOP
python3 bin/zebrachop-setup # interactive: detects bowtie/twoBitToFa/primer3, writes config_local.json
bash frontend/run.sh # open http://127.0.0.1:8000
If you prefer the CLI:
python2 chopchop_query.py --gene_names rx3,tbx16 -o results/ -- -G danRer11
python reformat.py -input results/ -output ranked_summary.txt
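Once a batch has finished, the per-gene TSVs in results/ can be inspected from Python. This is a minimal sketch under stated assumptions: each file is named after its gene (for example rx3.tsv), has a header row, and is already sorted best-first as described above. The `top_guides` helper is hypothetical and not part of ZebraCHOP, and no particular column names are assumed.

```python
import csv
from pathlib import Path


def top_guides(results_dir, n=3):
    """Map each gene (taken from the TSV file stem) to its first n rows."""
    summary = {}
    for tsv in sorted(Path(results_dir).glob("*.tsv")):
        with tsv.open(newline="") as handle:
            reader = csv.DictReader(handle, delimiter="\t")
            # zip() stops after n rows without reading the rest of the file.
            summary[tsv.stem] = [row for _, row in zip(range(n), reader)]
    return summary


if __name__ == "__main__":
    for gene, rows in top_guides("results").items():
        print(gene, len(rows), "guides shown")
```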
Why two Python versions?
The core CHOPCHOP engine (chopchop.py, chopchop_query.py, featurization.py) is Python 2.7 — it's a fork of upstream CHOPCHOP and depends on a frozen scikit-learn pickle (Doench_2016_18.01_model_nopos.pickle) that doesn't survive a clean Py3 port. The new web frontend and setup wizard are Python 3 and shell out to the Py2 engine as a subprocess.
The setup wizard records your Python-2 path under config_local.json["PYTHON2"] so you only have to set it once.
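To make the Py3-to-Py2 hand-off concrete, here is a rough sketch of how a Python 3 caller could assemble the engine command from the recorded interpreter path. It mirrors the CLI invocation shown in Quick start; the `build_engine_command` function is a hypothetical helper, not the frontend's actual code.

```python
import json
from pathlib import Path


def build_engine_command(genes, out_dir, config_path="config_local.json", genome="danRer11"):
    """Return the argv list for one chopchop_query.py batch run."""
    config = json.loads(Path(config_path).read_text())
    python2 = config["PYTHON2"]  # interpreter path recorded by the setup wizard
    return [
        python2, "chopchop_query.py",
        "--gene_names", ",".join(genes),
        "-o", str(out_dir),
        "--", "-G", genome,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; building an argv list rather than a shell string avoids quoting problems with user-supplied gene names.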
External dependencies
The setup wizard checks for these and helps you install/configure them:
| Tool | Why |
|---|---|
| bowtie | Off-target alignment against the zebrafish genome |
| twoBitToFa (UCSC kent utils) | Extract genomic sequence around target |
| primer3 (optional) | Primer design around the chosen guide |
| danRer11.2bit | UCSC two-bit genome file (~770 MB, auto-downloadable) |
| Bowtie index | Built from the zebrafish FASTA via bowtie-build |
You'll also need the bundled genetable/danRer11.gene_table (~4.8 MB), which the wizard points at by default.
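A quick way to see which of these tools are already on your PATH, independent of the wizard, is `shutil.which`. This is only a sketch of the detection idea: the `detect_tools` function is hypothetical, and the binary names (including `primer3_core` for primer3) are assumptions taken from the example configuration.

```python
import shutil

REQUIRED = ("bowtie", "twoBitToFa")
OPTIONAL = ("primer3_core",)


def detect_tools(which=shutil.which):
    """Return ({tool: path or None}, [names of required tools not found])."""
    found = {name: which(name) for name in REQUIRED + OPTIONAL}
    missing = [name for name in REQUIRED if found[name] is None]
    return found, missing


if __name__ == "__main__":
    found, missing = detect_tools()
    for name, path in found.items():
        print(f"{name}: {path or 'not found'}")
    if missing:
        print("Missing required tools:", ", ".join(missing))
```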
Repository layout
.
├── chopchop.py # Py2: core guide-design engine
├── chopchop_query.py # Py2: batch wrapper, one TSV per gene
├── reformat.py # Py3 OK: Ensembl REST exon-aware post-processor
├── featurization.py # Py2: ML feature builder (Doench 2016)
├── config.json # default tool paths (empty; copy to config_local.json)
├── frontend/ # NEW: Python-3 FastAPI web UI (subprocesses chopchop_query.py)
│ ├── main.py
│ ├── index.html
│ ├── requirements.txt
│ └── run.sh
├── bin/zebrachop-setup # NEW: Python-3 interactive setup wizard
├── genetable/danRer11.gene_table
├── results/ # example output TSVs (rx3, tbx16, tbxta)
├── ZebraCHOP Command Line Batch Analysis Pipeline.pdf
└── LICENSE
Configuration (config_local.json)
The wizard writes this automatically. Manual example:
{
  "PATH": {
    "BOWTIE": "/usr/local/bin/bowtie",
    "TWOBITTOFA": "/usr/local/bin/twoBitToFa",
    "PRIMER3": "/usr/local/bin/primer3_core",
    "TWOBIT_INDEX_DIR": "/data/zebrachop/indexes",
    "BOWTIE_INDEX_DIR": "/data/zebrachop/indexes",
    "GENE_TABLE_INDEX_DIR": "./genetable"
  },
  "THREADS": 4,
  "PYTHON2": "/usr/local/bin/python2"
}
config_local.json is git-ignored.
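If you edit the file by hand, a small sanity check can catch typos before a run fails midway. The sketch below assumes the key layout of the manual example above and treats PRIMER3 as optional; `check_config` is a hypothetical helper and does not reflect how the engine itself validates its configuration.

```python
from pathlib import Path

# Keys from the manual example; PRIMER3 is left out because primer3 is optional.
EXPECTED_PATH_KEYS = {
    "BOWTIE",
    "TWOBITTOFA",
    "TWOBIT_INDEX_DIR",
    "BOWTIE_INDEX_DIR",
    "GENE_TABLE_INDEX_DIR",
}


def check_config(config):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    paths = config.get("PATH", {})
    for key in sorted(EXPECTED_PATH_KEYS - set(paths)):
        problems.append(f"PATH.{key} is missing")
    for key, value in sorted(paths.items()):
        if value and not Path(value).exists():
            problems.append(f"PATH.{key} points at a missing location: {value}")
    threads = config.get("THREADS")
    if not isinstance(threads, int) or threads < 1:
        problems.append("THREADS should be a positive integer")
    if not config.get("PYTHON2"):
        problems.append("PYTHON2 is not set")
    return problems
```

A typical call would be `check_config(json.loads(Path("config_local.json").read_text()))`, printing each returned problem on its own line.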
Web UI
Run bash frontend/run.sh and open http://127.0.0.1:8000. Paste one or more gene names, pick the scoring model, click Design guides — the frontend spawns a Python-2 subprocess per batch, streams a job log, and renders each gene's ranked TSV as a sortable table you can download.
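The job-log streaming described above can be approximated with a generator that reads the child process's output line by line. This is a sketch of the general pattern, not the code in frontend/main.py; `stream_job` is a hypothetical name, and the demo runs a stand-in Python command instead of the Python-2 engine.

```python
import subprocess
import sys


def stream_job(cmd):
    """Yield the child's combined stdout/stderr line by line, then raise on failure."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same log stream
        text=True,
        bufsize=1,                 # line-buffered on the reading side
    )
    with proc:  # closes the pipe and waits for the child on exit
        for line in proc.stdout:
            yield line.rstrip("\n")
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)


if __name__ == "__main__":
    # Stand-in command; a real job would run chopchop_query.py under Python 2.
    for log_line in stream_job([sys.executable, "-c", "print('designing guides')"]):
        print("[job]", log_line)
```

Each yielded line can be forwarded to the browser as it arrives, so long Bowtie runs show progress instead of a blank page.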
Citing
If you use ZebraCHOP, please cite the upstream CHOPCHOP paper (Labun et al., NAR 2019). The poison-exon-aware Ensembl post-processing in reformat.py is an addition specific to this fork.
License
MIT — see LICENSE. Portions of the code derive from CHOPCHOP (MIT) and from Microsoft's Azimuth (BSD 3-Clause); see NOTICE and in-file headers for attributions.