
CRISPR gRNA design for zebrafish with Ensembl-aware filtering — Python 3 frontend wrapping the CHOPCHOP engine.

Project description

ZebraCHOP

CRISPR guide-RNA design for zebrafish (Danio rerio), with Ensembl-aware post-processing and a clean web UI.

ZebraCHOP is a zebrafish-focused fork of CHOPCHOP. It picks high-scoring Cas9, TALEN, Cpf1, and nickase guides for any zebrafish gene, scores them with the published efficiency models (Doench 2016, Xu 2015, …), maps off-targets with Bowtie, and post-processes the output via Ensembl REST to bias selection toward early exons, where a disruption is more likely to produce a loss of function.

Status: ready | CLI: Python 2.7 | Web: Python 3.9+ | License: MIT


What it does

   Gene name(s)
       │
       ▼
┌────────────────┐   ┌────────────────┐   ┌────────────────┐   ┌────────────────┐
│ Resolve via    │ → │ Extract        │ → │ Score guides   │ → │ Filter via     │
│ danRer11.gene  │   │ sequence with  │   │ + off-targets  │   │ Ensembl exon   │
│ _table         │   │ twoBitToFa     │   │ via Bowtie     │   │ structure      │
└────────────────┘   └────────────────┘   └────────────────┘   └────────────────┘
       │                                                              │
       └──────────────────────────► Per-gene .tsv (ranked) ◄──────────┘

Inputs: gene name(s) (or a genePred file). Outputs: one TSV per gene with ranked guides, off-target counts, GC%, self-complementarity, efficiency score, etc., plus an optional Ensembl-aware text summary.
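
The final stage in the diagram, biasing selection toward early exons, can be illustrated with a small sketch. This is not the logic in reformat.py: the function names, the tuple layouts, and the tie-break on efficiency score are assumptions made for illustration, and the exon coordinates are hard-coded rather than fetched from Ensembl REST.

```python
from typing import List, Optional, Tuple

Exon = Tuple[int, int]          # (start, end), inclusive, listed in transcript order
Guide = Tuple[str, int, float]  # (label, genomic position, efficiency score)


def exon_rank(position: int, exons: List[Exon]) -> Optional[int]:
    """Return the 1-based index of the exon containing position, or None."""
    for index, (start, end) in enumerate(exons, start=1):
        if start <= position <= end:
            return index
    return None


def prefer_early_exons(guides: List[Guide], exons: List[Exon]) -> List[Guide]:
    """Drop guides outside exons, then sort by exon index and descending score."""
    ranked = []
    for guide in guides:
        rank = exon_rank(guide[1], exons)
        if rank is not None:
            ranked.append((rank, -guide[2], guide))
    ranked.sort(key=lambda item: (item[0], item[1]))
    return [guide for _, _, guide in ranked]


# Made-up coordinates and scores, for illustration only.
exons = [(100, 200), (400, 500), (800, 900)]
guides = [
    ("GUIDE_A", 850, 0.90),  # exon 3
    ("GUIDE_B", 150, 0.55),  # exon 1
    ("GUIDE_C", 300, 0.99),  # intronic, dropped
    ("GUIDE_D", 120, 0.70),  # exon 1, higher score than GUIDE_B
]
ordered = prefer_early_exons(guides, exons)
print([g[0] for g in ordered])
```

Sorting on exon index first means a modest guide in exon 1 outranks a high-scoring guide in exon 3; a real filter might instead weight the two criteria.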


Quick start

git clone https://github.com/abachu2005/ZebraCHOP.git
cd ZebraCHOP
python3 bin/zebrachop-setup     # interactive: detects bowtie/twoBitToFa/primer3, writes config_local.json
bash frontend/run.sh            # open http://127.0.0.1:8000

If you prefer the CLI:

python2 chopchop_query.py --gene_names rx3,tbx16 -o results/ -- -G danRer11
python  reformat.py -input results/ -output ranked_summary.txt
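
To work with the per-gene TSVs from a script, a minimal sketch using only the standard library might look like this. The header names in the sample are invented; the real columns are whatever chopchop_query.py writes.

```python
import csv
import io
from typing import Dict, List


def read_guides(handle) -> List[Dict[str, str]]:
    """Parse a tab-separated guide table into one dict per row."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Stand-in for one per-gene TSV; header names and rows are hypothetical.
sample = io.StringIO(
    "Rank\tTarget sequence\tGC content (%)\n"
    "1\tACGTACGTACGTACGTACGTAGG\t52\n"
    "2\tTTGACCGGTTAACCGGTTAACGG\t48\n"
)
rows = read_guides(sample)
print(len(rows), sorted(rows[0].keys()))
```

With a real file, replace the io.StringIO object with open(path, newline="").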

Why two Python versions?

The core CHOPCHOP engine (chopchop.py, chopchop_query.py, featurization.py) is Python 2.7 — it's a fork of upstream CHOPCHOP and depends on a frozen scikit-learn pickle (Doench_2016_18.01_model_nopos.pickle) that doesn't survive a clean Py3 port. The new web frontend and setup wizard are Python 3 and shell out to the Py2 engine as a subprocess.

The setup wizard records your Python-2 path under config_local.json["PYTHON2"] so you only have to set it once.
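
As a rough sketch of how a Python 3 caller could assemble the Python 2 invocation from that setting: the function name and the fallback to a bare python2 on PATH are assumptions, while the arguments mirror the CLI example above.

```python
import json
from typing import Dict, List


def build_engine_command(config: Dict, genes: List[str], out_dir: str,
                         genome: str = "danRer11") -> List[str]:
    """Assemble the argv that runs chopchop_query.py under Python 2."""
    python2 = config.get("PYTHON2") or "python2"  # assumed fallback: PATH lookup
    return [
        python2, "chopchop_query.py",
        "--gene_names", ",".join(genes),
        "-o", out_dir,
        "--", "-G", genome,
    ]


config = json.loads('{"PYTHON2": "/usr/local/bin/python2"}')
command = build_engine_command(config, ["rx3", "tbx16"], "results/")
print(command)
# A caller would then pass the list to subprocess.run(command, check=True).
```

Passing the command as a list rather than a shell string avoids quoting problems with gene names and paths.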


External dependencies

The setup wizard checks for these and helps you install/configure them:

Tool                           Why
bowtie                         Off-target alignment against the zebrafish genome
twoBitToFa (UCSC kent utils)   Extract genomic sequence around target
primer3 (optional)             Primer design around the chosen guide
danRer11.2bit                  UCSC two-bit genome file (~770 MB, auto-downloadable)
Bowtie index                   Built from the zebrafish FASTA via bowtie-build

You'll also need the bundled genetable/danRer11.gene_table (~4.8 MB), which the wizard already points to.
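
A quick way to see which executables are already on PATH, independent of the wizard, is sketched below. It is not the wizard's own detection code; the function name is hypothetical, and primer3_core is assumed as the primer3 binary name because that is what the configuration example uses.

```python
import shutil
from typing import Dict, Iterable, Optional


def locate_tools(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each executable name to the path found on PATH, or None."""
    return {name: shutil.which(name) for name in names}


found = locate_tools(["bowtie", "bowtie-build", "twoBitToFa", "primer3_core"])
for name, path in found.items():
    print(f"{name:14s} {path or 'NOT FOUND'}")
```

Anything reported as NOT FOUND either needs installing or needs an explicit path in config_local.json.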


Repository layout

.
├── chopchop.py                  # Py2: core guide-design engine
├── chopchop_query.py            # Py2: batch wrapper, one TSV per gene
├── reformat.py                  # Py3 OK: Ensembl REST exon-aware post-processor
├── featurization.py             # Py2: ML feature builder (Doench 2016)
├── config.json                  # default tool paths (empty; copy to config_local.json)
├── frontend/                    # NEW: Python-3 FastAPI web UI (subprocesses chopchop_query.py)
│   ├── main.py
│   ├── index.html
│   ├── requirements.txt
│   └── run.sh
├── bin/zebrachop-setup          # NEW: Python-3 interactive setup wizard
├── genetable/danRer11.gene_table
├── results/                     # example output TSVs (rx3, tbx16, tbxta)
├── ZebraCHOP Command Line Batch Analysis Pipeline.pdf
└── LICENSE

Configuration (config_local.json)

The wizard writes this automatically. Manual example:

{
  "PATH": {
    "BOWTIE": "/usr/local/bin/bowtie",
    "TWOBITTOFA": "/usr/local/bin/twoBitToFa",
    "PRIMER3": "/usr/local/bin/primer3_core",
    "TWOBIT_INDEX_DIR": "/data/zebrachop/indexes",
    "BOWTIE_INDEX_DIR": "/data/zebrachop/indexes",
    "GENE_TABLE_INDEX_DIR": "./genetable"
  },
  "THREADS": 4,
  "PYTHON2": "/usr/local/bin/python2"
}

config_local.json is git-ignored.
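
When editing the file by hand, a small check like the following can catch empty or missing entries before a run. Which keys count as required is an assumption here (PRIMER3 is left out because primer3 is listed as optional), and the function name is hypothetical.

```python
import json
from typing import Dict, List

# Assumed set of mandatory PATH entries, taken from the manual example above.
REQUIRED_PATH_KEYS = [
    "BOWTIE", "TWOBITTOFA", "TWOBIT_INDEX_DIR",
    "BOWTIE_INDEX_DIR", "GENE_TABLE_INDEX_DIR",
]


def missing_settings(config: Dict) -> List[str]:
    """List required settings that are absent or empty in a parsed config."""
    paths = config.get("PATH", {})
    missing = [f"PATH.{key}" for key in REQUIRED_PATH_KEYS if not paths.get(key)]
    if not config.get("PYTHON2"):
        missing.append("PYTHON2")
    return missing


# Deliberately incomplete config to show what gets reported.
example = json.loads("""
{
  "PATH": {
    "BOWTIE": "/usr/local/bin/bowtie",
    "TWOBITTOFA": "",
    "GENE_TABLE_INDEX_DIR": "./genetable"
  },
  "THREADS": 4
}
""")
print(missing_settings(example))
```

For a real file, load it with json.load(open("config_local.json")) and pass the result to missing_settings.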

Web UI

Run bash frontend/run.sh and open http://127.0.0.1:8000. Paste one or more gene names, pick the scoring model, click Design guides — the frontend spawns a Python-2 subprocess per batch, streams a job log, and renders each gene's ranked TSV as a sortable table you can download.
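
The job-log streaming described above can be approximated with the standard library alone. This is a sketch of the general technique, not the code in frontend/main.py; the function name is hypothetical and a trivial child process stands in for chopchop_query.py.

```python
import subprocess
import sys
from typing import Iterator, List


def stream_log(command: List[str]) -> Iterator[str]:
    """Yield a child process's combined stdout/stderr one line at a time."""
    with subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    ) as process:
        for line in process.stdout:
            yield line.rstrip("\n")
    # Leaving the with-block waits for the child, so returncode is set here.
    if process.returncode != 0:
        raise subprocess.CalledProcessError(process.returncode, command)


# Stand-in child process; a real caller would launch the Python 2 engine.
demo = [sys.executable, "-c", "print('designing rx3'); print('done')"]
lines = list(stream_log(demo))
print(lines)
```

A web handler could forward each yielded line to the browser as it arrives instead of collecting them into a list.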

Citing

If you use ZebraCHOP, please cite the upstream CHOPCHOP paper (Labun et al., NAR 2019). The poison-exon-aware Ensembl post-processing in reformat.py is an addition specific to this fork.

License

MIT — see LICENSE. Portions of the code derive from CHOPCHOP (MIT) and from Microsoft's Azimuth (BSD 3-Clause); see NOTICE and in-file headers for attributions.
