TRACE: Triple-aligner Read Analysis for CRISPR Editing

Features

  • Triple-aligner consensus: Uses BWA-MEM, BBMap, and minimap2 for robust alignment
  • Automatic inference: Detects PAM, cleavage site, homology arms, and edits from sequences
  • K-mer classification: Fast pre-alignment HDR/WT detection using 12-mers
  • Multi-nuclease support: Cas9 and Cas12a (Cpf1) with correct cleavage geometry
  • Auto-detection: Detects library type (TruSeq/Tn5), whether reads need merging, and the CRISPResso mode
  • CRISPResso2 integration: Validates results against CRISPResso2, a standard CRISPR analysis tool
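
The k-mer classification feature can be illustrated with a minimal sketch. This is not TRACE's implementation: the function names are hypothetical, the sketch assumes substitution-only edits (reference and HDR template of equal length), and it ignores reverse-complemented reads. It collects the 12-mers that overlap the edited positions in each sequence, then labels a read by which set it shares more k-mers with.

```python
K = 12  # k-mer length named in the feature list


def kmers_spanning(seq, positions, k=K):
    """Return every k-mer of seq that overlaps one of the 0-based positions."""
    found = set()
    for pos in positions:
        first = max(0, pos - k + 1)
        last = min(pos, len(seq) - k)
        for start in range(first, last + 1):
            found.add(seq[start:start + k])
    return found


def diagnostic_kmers(reference, hdr_template, k=K):
    """Return (WT-only, HDR-only) k-mer sets covering the edited positions."""
    if len(reference) != len(hdr_template):
        raise ValueError("sketch assumes substitution-only edits (equal lengths)")
    edits = [i for i, (a, b) in enumerate(zip(reference, hdr_template)) if a != b]
    wt = kmers_spanning(reference, edits, k)
    hdr = kmers_spanning(hdr_template, edits, k)
    # Drop any k-mer present in both sets so each set is truly diagnostic.
    return wt - hdr, hdr - wt


def classify_read(read, wt_kmers, hdr_kmers, k=K):
    """Label a read as HDR, WT, or unclassified by counting diagnostic k-mers."""
    read_kmers = {read[i:i + k] for i in range(len(read) - k + 1)}
    wt_hits = len(read_kmers & wt_kmers)
    hdr_hits = len(read_kmers & hdr_kmers)
    if hdr_hits > wt_hits:
        return "HDR"
    if wt_hits > hdr_hits:
        return "WT"
    return "unclassified"
```

Because the lookup is a set intersection, a read can be labelled without aligning it, which is why this kind of check is cheap enough to run before alignment.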

Installation

pip (Python package only)

pip install trace-crispr

conda (includes external aligners)

conda install -c bioconda -c conda-forge trace-crispr

Development installation

git clone https://github.com/k-roy/trace.git
cd trace
pip install -e ".[dev]"

Quick Start

Minimal run (3 required locus inputs, plus reads and an output directory)

trace run \
  --reference amplicon.fasta \
  --hdr-template hdr_template.fasta \
  --guide GCTGAAGCACTGCACGCCGT \
  --r1 sample_R1.fastq.gz \
  --r2 sample_R2.fastq.gz \
  --output results/

Check locus configuration without running

trace info \
  --reference amplicon.fasta \
  --hdr-template hdr_template.fasta \
  --guide GCTGAAGCACTGCACGCCGT

This prints the inferred locus configuration, for example:

=== TRACE Analysis Configuration ===

Reference sequence: 500 bp
HDR template: 500 bp

Donor template analysis:
  - Left homology arm: positions 1-245 on reference (245 bp)
  - Right homology arm: positions 255-500 on reference (245 bp)
  - Donor edits detected at positions: 246, 247 on reference
    * Position 246: C → G (PAM-silencing mutation)
    * Position 247: C → T (chromophore Y66H mutation)

Guide analysis:
  - Guide sequence: GCTGAAGCACTGCACGCCGT
  - Guide targets: positions 248-267 on reference (- strand)
  - PAM: GGG at positions 245-247 on reference
  - Cleavage site: position 248 on reference
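
To show the kind of comparison behind a report like the one above, here is a small sketch that finds the donor edits by comparing the reference with the HDR template. The function name is hypothetical and the logic is a simplification: it assumes both sequences have the same length, and it treats everything outside the outermost edits as homology arm, which need not match the arm boundaries TRACE itself reports.

```python
def infer_donor_edits(reference, hdr_template):
    """Return (edits, left_arm, right_arm) using 1-based reference positions.

    edits is a list of (position, reference_base, donor_base) tuples.
    Each arm is an inclusive (start, end) tuple, or None if there are no edits.
    """
    if len(reference) != len(hdr_template):
        raise ValueError("sketch assumes substitution-only edits (equal lengths)")
    ref = reference.upper()
    donor = hdr_template.upper()
    edits = [
        (i + 1, r, d)
        for i, (r, d) in enumerate(zip(ref, donor))
        if r != d
    ]
    if not edits:
        return edits, None, None
    first_edit = edits[0][0]
    last_edit = edits[-1][0]
    left_arm = (1, first_edit - 1)          # everything before the first edit
    right_arm = (last_edit + 1, len(ref))   # everything after the last edit
    return edits, left_arm, right_arm
```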

Multiple samples

Create a sample key TSV:

sample_id	r1_path	r2_path	condition
sample_1	/path/to/S1_R1.fastq.gz	/path/to/S1_R2.fastq.gz	treatment
sample_2	/path/to/S2_R1.fastq.gz	/path/to/S2_R2.fastq.gz	control

Then run:

trace run \
  --reference amplicon.fasta \
  --hdr-template hdr_template.fasta \
  --guide GCTGAAGCACTGCACGCCGT \
  --sample-key samples.tsv \
  --output results/ \
  --threads 16
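
For runs with many samples, the sample key can be generated rather than typed by hand. The sketch below is an assumption-laden helper, not part of TRACE: the function name is hypothetical, it assumes FASTQ files are named `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz`, and it writes "NA" for any sample without a known condition (whether TRACE accepts that placeholder is not confirmed here). The column names are the ones shown in the sample key above.

```python
import csv
from pathlib import Path

COLUMNS = ["sample_id", "r1_path", "r2_path", "condition"]
R1_SUFFIX = "_R1.fastq.gz"  # assumed naming convention
R2_SUFFIX = "_R2.fastq.gz"  # assumed naming convention


def write_sample_key(fastq_dir, out_path, conditions=None):
    """Pair R1/R2 files in fastq_dir and write a tab-separated sample key."""
    conditions = conditions or {}
    rows = []
    for r1 in sorted(Path(fastq_dir).glob("*" + R1_SUFFIX)):
        sample_id = r1.name[: -len(R1_SUFFIX)]
        r2 = r1.with_name(sample_id + R2_SUFFIX)
        if not r2.exists():
            continue  # skip samples without a mate file
        rows.append({
            "sample_id": sample_id,
            "r1_path": str(r1),
            "r2_path": str(r2),
            "condition": conditions.get(sample_id, "NA"),  # placeholder value
        })
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

The resulting file can then be passed to `--sample-key` as in the command above.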

Using Cas12a

trace run \
  --reference amplicon.fasta \
  --hdr-template hdr_template.fasta \
  --guide GCTGAAGCACTGCACGCCGTAA \
  --nuclease cas12a \
  --sample-key samples.tsv \
  --output results/

Nuclease Support

Cas9 (SpCas9)

  • PAM: NGG (3' of protospacer)
  • Cleavage: 3 bp upstream of PAM (blunt ends)

Cas12a (LbCpf1)

  • PAM: TTTN (5' of protospacer)
  • Cleavage: 18-19 bp downstream of the PAM on the non-target strand, 23 bp on the target strand
  • Creates a 4-5 nt 5' overhang (staggered cut)
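
The difference in cleavage geometry can be made concrete with a short sketch. It is illustrative only and does not reflect TRACE's internal code: the names are hypothetical, the protospacer is assumed to lie on the forward strand, coordinates are 0-based, and a cut is expressed as the number of bases to its left. For Cas12a it uses 18 and 23 as representative offsets from the PAM-proximal end of the protospacer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CutSite:
    """Cut boundaries on the two strands, as 0-based boundary indices."""
    near: int  # cut closer to the start of the sequence
    far: int   # cut farther from the start; equals near for a blunt cut

    @property
    def overhang(self):
        """Length of the single-stranded overhang in nucleotides."""
        return self.far - self.near


def cut_site(protospacer_start, protospacer_len, nuclease):
    """Return the cut geometry for a forward-strand protospacer."""
    if nuclease == "cas9":
        # PAM is 3' of the protospacer; blunt cut 3 bp upstream of the PAM.
        cut = protospacer_start + protospacer_len - 3
        return CutSite(near=cut, far=cut)
    if nuclease == "cas12a":
        # PAM is 5' of the protospacer; the two strands are cut at
        # different distances from the PAM, leaving a staggered end.
        return CutSite(near=protospacer_start + 18, far=protospacer_start + 23)
    raise ValueError(f"unsupported nuclease: {nuclease}")
```

Representing the cut as a pair of boundaries lets the same downstream code handle both blunt and staggered cuts, with the overhang falling out as the difference between the two.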

Output

The main output is a TSV file with per-sample editing outcomes:

Column                 Description
sample                 Sample ID
classifiable_reads     Total classifiable reads
duplicate_rate         PCR duplicate rate (Tn5)
Dedup_WT_%             Wild-type % (deduplicated)
Dedup_HDR_%            HDR % (deduplicated)
Dedup_NHEJ_%           NHEJ % (deduplicated)
Dedup_LgDel_%          Large deletion %
kmer_hdr_rate          K-mer method HDR rate
crispresso_hdr_rate    CRISPResso2 HDR rate
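
A results table with these columns can be loaded with the standard library for quick inspection. The sketch below is a hypothetical helper, not a TRACE API: it assumes the file is tab-separated with exactly the column names listed above, and it simply ranks samples by deduplicated HDR percentage. The name of the output file inside the results directory is not stated here, so the function takes any open file handle.

```python
import csv

# Numeric outcome columns, as named in the table above.
OUTCOME_COLUMNS = ["Dedup_WT_%", "Dedup_HDR_%", "Dedup_NHEJ_%", "Dedup_LgDel_%"]


def load_outcomes(handle):
    """Parse a TRACE-style results TSV and sort samples by HDR %, highest first."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        parsed = {"sample": row["sample"]}
        for column in OUTCOME_COLUMNS:
            parsed[column] = float(row[column])
        rows.append(parsed)
    return sorted(rows, key=lambda r: r["Dedup_HDR_%"], reverse=True)


def format_summary(rows):
    """Render one line per sample with its outcome percentages."""
    lines = []
    for row in rows:
        values = ", ".join(f"{col}={row[col]:.1f}" for col in OUTCOME_COLUMNS)
        lines.append(f"{row['sample']}: {values}")
    return "\n".join(lines)
```

Typical use would be `with open(path) as handle: print(format_summary(load_outcomes(handle)))`, where `path` points at the TSV written to the output directory.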

Dependencies

Python

  • click>=8.0
  • pysam>=0.20
  • pandas>=1.5
  • numpy>=1.20
  • pyyaml>=6.0
  • rapidfuzz>=3.0
  • tqdm>=4.60

External tools (via conda)

  • bwa>=0.7
  • bbmap>=39
  • minimap2>=2.24
  • samtools>=1.16
  • crispresso2 (optional, but enabled by default)
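
After a pip-only installation the external tools are not installed automatically, so it can help to confirm they are on the PATH before a run. The following is a small sketch of such a check, not a TRACE command. The executable names are the ones these packages normally install (`bbmap.sh` for BBMap, `CRISPResso` for CRISPResso2), and the function name is hypothetical. It does not verify version numbers.

```python
import shutil

# Package name -> executable expected on the PATH.
REQUIRED_TOOLS = {
    "bwa": "bwa",
    "bbmap": "bbmap.sh",
    "minimap2": "minimap2",
    "samtools": "samtools",
}
OPTIONAL_TOOLS = {"crispresso2": "CRISPResso"}


def missing_tools(tools, which=shutil.which):
    """Return the sorted package names whose executable cannot be found."""
    return sorted(name for name, exe in tools.items() if which(exe) is None)


if __name__ == "__main__":
    for label, tools in (("required", REQUIRED_TOOLS), ("optional", OPTIONAL_TOOLS)):
        missing = missing_tools(tools)
        status = ", ".join(missing) if missing else "all found"
        print(f"{label} tools missing: {status}")
```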

Author

Kevin R. Roy

License

MIT
