CPU-friendly sequence-only CRISCross off-target prediction with a scikit-learn-style API. Optional genome scanning (criscross.offinder) needs a system OpenCL runtime; see README.
criscross
CPU-friendly sequence-only CRISCross off-target prediction with a scikit-learn-style API. The pretrained model weights ship inside the wheel (fp16 + zstd-compressed) so no extra downloads are required.
Install
pip install criscross
Optional marker extra (PEP 621 optional-dependencies): use
pip install criscross[offinder] to signal you intend to use genome scanning.
It does not pull OpenCL from PyPI (that is impossible); it exists so
install lines, docs, and CI can reference one discoverable extra. See the
Python Packaging User Guide on optional dependencies.
CPU-only install (no CUDA libraries pulled in):
pip install criscross --extra-index-url https://download.pytorch.org/whl/cpu
If you plan to use criscross.offinder (important)
pip install criscross is enough for model inference (sequence_model.predict(...)).
If you also want genome scanning via criscross.offinder.prepare(...), you must
install an OpenCL runtime on the machine. Cas-OFFinder requires OpenCL even in
CPU mode. This follows the usual pattern for Python packages that wrap native tools:
document system dependencies, expose optional extras for discoverability,
and fail with a clear error when the feature is used without the runtime.
Check your environment before a long scan:
from criscross.offinder import check_opencl, opencl_setup_instructions

# Run the check once and reuse the result
status = check_opencl()
print(status)
if not status["ok"]:
    print(opencl_setup_instructions())
Linux (Ubuntu/Debian):
sudo apt update
sudo apt install pocl-opencl-icd
Conda (cross-platform option):
conda install -c conda-forge pocl ocl-icd-system
If OpenCL is missing, offinder.prepare(...) will fail with an error like
clGetPlatformIDs Failed: -1001.
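Error code -1001 is CL_PLATFORM_NOT_FOUND_KHR: the OpenCL loader found no registered platform. On Linux the loader typically discovers platforms through .icd files in /etc/OpenCL/vendors, and conda environments usually keep their own copy under $CONDA_PREFIX/etc/OpenCL/vendors. The sketch below lists those files as a quick heuristic. The helper name find_opencl_icds is hypothetical and not part of criscross, the directory layout is an assumption about a typical Linux setup, and the result is no substitute for check_opencl().

```python
import os
from pathlib import Path


def find_opencl_icds(vendor_dirs=None):
    """Return the .icd files found in the usual Linux ICD directories.

    Hypothetical helper, not part of criscross. An empty result suggests
    that no OpenCL platform is registered, which is the condition that
    error -1001 (CL_PLATFORM_NOT_FOUND_KHR) reports.
    """
    if vendor_dirs is None:
        # Assumed default locations: system-wide, then the active conda env.
        vendor_dirs = [Path("/etc/OpenCL/vendors")]
        conda_prefix = os.environ.get("CONDA_PREFIX")
        if conda_prefix:
            vendor_dirs.append(Path(conda_prefix) / "etc" / "OpenCL" / "vendors")
    found = []
    for directory in vendor_dirs:
        directory = Path(directory)
        if directory.is_dir():
            found.extend(sorted(directory.glob("*.icd")))
    return found


if __name__ == "__main__":
    icds = find_opencl_icds()
    print(icds if icds else "no .icd files found; install PoCL or another OpenCL runtime")
```

An empty list after installing pocl-opencl-icd or the conda packages would point to an install problem and not to criscross itself.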
Quickstart
from criscross import sequence_model
import pandas as pd
# Single datapoint (dict)
prob = sequence_model.predict({
    "Guide_sequence": "GCTCGGGGACACAGGATCCCTGG",  # 23 nt
    "off_target_512nt": "GCAG...TGCC",  # 512 nt, RC for - strand
    "strand_id": 1,  # 1 for +, 0 for -
})
print(prob) # float in [0, 1]
# Dataset (DataFrame or CSV path)
df = pd.read_csv("examples/sample_input.csv")
probs = sequence_model.predict(df) # -> np.ndarray, shape [N]
probs = sequence_model.predict("examples/sample_input.csv") # same
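When candidate sites are assembled in plain Python, the CSV form can be written with the standard library alone. The sketch below is one way to do that. The helper name write_input_csv is hypothetical and not part of criscross; only the three column names come from this README.

```python
import csv

# Column names taken from the "Required columns/keys" table.
REQUIRED_COLUMNS = ["Guide_sequence", "off_target_512nt", "strand_id"]


def write_input_csv(records, path):
    """Write dict records to a CSV holding the three required columns.

    Hypothetical helper, not part of criscross. Extra keys in a record
    are ignored; a missing required key raises KeyError.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=REQUIRED_COLUMNS)
        writer.writeheader()
        for record in records:
            writer.writerow({column: record[column] for column in REQUIRED_COLUMNS})
```

A file written this way, for example write_input_csv(records, "my_candidates.csv"), should then be usable as the path argument of sequence_model.predict(...).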
Preparing inputs from a genome scan (Cas-OFFinder)
If you have guide RNA(s) and a reference genome FASTA, you can generate the
Guide_sequence/off_target_512nt/strand_id table with Cas-OFFinder and feed
it directly into sequence_model.predict(...).
from criscross import offinder, sequence_model
X = offinder.prepare(
    guide_rnas=["GCTCGGGGACACAGGATCCCTGG"],
    fasta="/path/to/GRCh38.primary_assembly.genome.fa",
    pam="NGG",  # default
    max_mismatches=6,  # default
)
# X is a DataFrame you can pass straight to criscross
probs = sequence_model.predict(X)
Requirements:
- Cas-OFFinder needs an OpenCL runtime even for CPU mode. On Linux, the
  simplest CPU runtime is PoCL, e.g.
  conda install -c conda-forge pocl ocl-icd-system (or sudo apt install pocl-opencl-icd).
- Cas-OFFinder 2.4.1 is bundled inside the criscross wheel. If you prefer
  to use your own build, set
  CAS_OFFINDER=/path/to/cas-offinder (or pass cas_offinder_path=).
Accepted inputs to predict(X)
| X | Returned |
|---|---|
| dict / pandas.Series with the 3 required keys | float |
| (guide, off_target_512nt, strand_id) 3-tuple | float |
| pandas.DataFrame with the 3 required columns | np.ndarray shape [N] |
| list of dicts | np.ndarray shape [N] |
| str / pathlib.Path pointing to a CSV with the 3 columns | np.ndarray shape [N] |
Required columns/keys:
| key | dtype | meaning |
|---|---|---|
| Guide_sequence | 23nt string | sgRNA guide sequence |
| off_target_512nt | 512nt string | candidate off-target window, already reverse-complemented for - strand |
| strand_id | int 0/1 | 1 for + strand, 0 for - |
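The table's constraints (23 nt guide, 512 nt window, reverse complement for the - strand) are easy to get wrong when records are built by hand. The sketch below shows one way to enforce them. It assumes the 512 nt window was read off the + strand of the reference. Both function names are hypothetical and not part of criscross, and the sketch does not address where the candidate site sits inside the window, since this README does not specify that.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def make_record(guide, plus_strand_window, strand):
    """Build one input record from a window read off the + strand.

    Hypothetical helper, not part of criscross. `strand` is "+" or "-";
    a "-" hit gets its window reverse-complemented, as the table requires.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if len(guide) != 23:
        raise ValueError(f"guide must be 23 nt, got {len(guide)}")
    if len(plus_strand_window) != 512:
        raise ValueError(f"window must be 512 nt, got {len(plus_strand_window)}")
    if strand == "+":
        window = plus_strand_window
    else:
        window = reverse_complement(plus_strand_window)
    return {
        "Guide_sequence": guide,
        "off_target_512nt": window,
        "strand_id": 1 if strand == "+" else 0,
    }
```

Rejecting bad lengths up front keeps a malformed row from surfacing later as a less obvious error inside the model.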
CLI
criscross predict --csv examples/sample_input.csv --out preds.csv
If the input CSV also has a label column (0/1), AUPRC is printed to stderr.
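This README does not say how the CLI computes AUPRC. One common estimator is average precision, the step-wise sum of precision weighted by the recall gained at each score threshold. The pure-Python sketch below implements that estimator for sanity checks on small files. The function name is hypothetical, and its result may differ from the CLI's figure if the CLI uses a different estimator (for example, trapezoidal interpolation).

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Sums precision at each distinct score threshold, weighted by the
    recall gained there. Tied scores are treated as one threshold.
    """
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example that shares this score.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        recall = tp / n_pos
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

Called with the label column and the predicted probabilities from preds.csv, it gives a number to compare against the value printed on stderr.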
Loading a custom checkpoint
from criscross import sequence_model
sequence_model.load("path/to/my_model.pt") # fp32 raw .pt
sequence_model.load("path/to/my_model.pt.zst") # zstd-compressed fp16
Inspecting the model
sequence_model.config() # hyperparameters used to build CRISCross(**config)
sequence_model.metadata() # versions, training-time test_auprc, input/output signature, seed
Citation
If you use this package in research, please cite the upstream CRISCross work. This package is a CPU-only, sequence-only repackaging of that model.
File details
Details for the file criscross-0.1.6.tar.gz.
File metadata
- Download URL: criscross-0.1.6.tar.gz
- Upload date:
- Size: 70.8 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | faa2232f3b8d057f9546d556c66a6c0d8d2c79bed7090aa53039342f9bfd4832 |
| MD5 | 7c73c58e6f0ce0112ac8e74796d9e2ea |
| BLAKE2b-256 | c4943c5fe005adc5ea6930afd06ff599f4bcd428559c01dd08f300862fd8c33e |
File details
Details for the file criscross-0.1.6-py3-none-any.whl.
File metadata
- Download URL: criscross-0.1.6-py3-none-any.whl
- Upload date:
- Size: 70.7 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 14ec08ffe673e01441a92bb63bccaacbd91ce45432929a0a94ab0e07b87faf5f |
| MD5 | f0a5929ef3fa039309926153ceb5bb7f |
| BLAKE2b-256 | 4695638fe644bccd1408848a193208b595b52fd89011b90078b205e4bce10ba4 |
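A downloaded file can be checked against the SHA256 digests listed above with the standard library. This is a minimal sketch. The function names are hypothetical, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so a ~70 MB file is never held in memory whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """True if the file's SHA256 equals the published digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the wheel, that would be matches_published_digest("criscross-0.1.6-py3-none-any.whl", "14ec08ffe673e01441a92bb63bccaacbd91ce45432929a0a94ab0e07b87faf5f"), assuming the file sits in the current directory.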