SniffCell: Annotate SV cell types based on CpG methylation
SniffCell
SniffCell annotates structural variants (SVs) using long-read methylation evidence and cell-type-specific ctDMR signals.
Installation
pip install sniffcell # from PyPI
pip install -e . # local development
Requires Python >=3.10.
Commands
sniffcell {find, deconv, anno, svanno, dmsv, viz, igvviz, report}
Typical Workflow
- Call ctDMRs from an atlas with find.
- Annotate SVs with ctDMR evidence using anno.
- Re-run SV assignment from saved read tables with svanno (optional).
- Generate an HTML review report with report.
- Visualize individual SVs with viz or igvviz.
- Test differential methylation near SVs with dmsv (optional).
- Deconvolve cell-type composition from any BAM with deconv (optional).
find: Call ctDMRs From an Atlas
Loads an atlas methylation matrix and calls cell-type-specific differentially methylated regions (ctDMRs).
sniffcell find \
-n atlas/all_celltypes_blocks.npy \
-i atlas/all_celltypes_blocks.index.gz \
-cf atlas/index_to_major_celltypes.json \
-m atlas/all_celltypes.txt \
-ck pbmc \
-o pbmc_ctdmr.tsv \
--diff_threshold 0.40 \
--min_rows 2 \
--min_cpgs 3 \
--max_gap_bp 500
If -n/-i/-cf/-m are omitted, paths default to ./atlas/... in your working directory.
-ck/--celltypes_keys selects a top-level JSON key mapping {group_name: [sample_id, ...]}.
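As a rough illustration of the layout that -ck selects from, the sketch below builds a small mapping and picks one top-level key. The group names and sample IDs are invented, select_groups is a hypothetical helper, and the real index_to_major_celltypes.json may differ in detail.

```python
import json

# Hypothetical file content: each top-level key maps group names to sample IDs.
# Every key, group name, and ID below is invented for illustration.
celltypes_json = """
{
  "pbmc": {
    "t_cell": ["sample_01", "sample_02"],
    "b_cell": ["sample_03"],
    "monocyte": ["sample_04", "sample_05"]
  },
  "other_panel": {
    "neuron": ["sample_06"]
  }
}
"""


def select_groups(text: str, key: str) -> dict[str, list[str]]:
    """Return the {group_name: [sample_id, ...]} mapping stored under `key`."""
    data = json.loads(text)
    if key not in data:
        raise KeyError(f"{key!r} not found; available keys: {sorted(data)}")
    return data[key]


groups = select_groups(celltypes_json, "pbmc")
for group_name, sample_ids in groups.items():
    print(group_name, len(sample_ids))
```

Checking the available top-level keys this way before running find can catch a mistyped -ck value early.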
Outputs:
- <output>: annotation-ready ctDMR TSV
- <output>.igv.bed: IGV BED9 companion file
anno: Annotate SVs With ctDMRs
Classifies reads per ctDMR region, then assigns cell-type codes to each SV.
sniffcell anno \
-i sample.bam \
-v sample.vcf.gz \
-r ref.fa \
-b pbmc_ctdmr.tsv \
-o anno_out \
-w 10000 \
--breakpoint_exclusion_frac 0.1 \
-t 8 \
--evidence_mode all_rows \
--min_overlap_pct 0.0 \
--min_agreement_pct 1.0
Key options:
- --evidence_mode {all_rows,per_read}: how ctDMR evidence is aggregated (default: all_rows)
- --breakpoint_exclusion_frac: excludes ctDMRs within ±frac × SVLEN of the breakpoint (default: 0.0)
- --min_overlap_pct / --min_agreement_pct: filtering thresholds

assigned_code is suppressed when has_hard_conflict=True.
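The ±frac × SVLEN rule can be sketched as a small interval check. This is a minimal illustration, not SniffCell's implementation: the function names are hypothetical, and it assumes a single breakpoint coordinate, a rounded margin, and that any overlap with the excluded zone is enough to drop a ctDMR.

```python
def exclusion_interval(breakpoint: int, svlen: int, frac: float) -> tuple[int, int]:
    """Return the [start, end) zone around a breakpoint that is excluded.

    SVLEN can be negative (for example, deletions in a VCF), so the
    absolute value is used. Rounding the margin is an assumption.
    """
    margin = round(abs(svlen) * frac)
    return max(0, breakpoint - margin), breakpoint + margin


def is_excluded(ctdmr_start: int, ctdmr_end: int,
                breakpoint: int, svlen: int, frac: float) -> bool:
    """True if the ctDMR [ctdmr_start, ctdmr_end) overlaps the excluded zone.

    Assumption: any overlap excludes the ctDMR; a frac of 0 disables the rule.
    """
    if frac <= 0:
        return False
    zone_start, zone_end = exclusion_interval(breakpoint, svlen, frac)
    return ctdmr_start < zone_end and ctdmr_end > zone_start


# A 5 kb deletion at position 1,000,000 with frac 0.1 gives a 500 bp margin.
print(exclusion_interval(1_000_000, -5000, 0.1))
print(is_excluded(999_400, 999_600, 1_000_000, -5000, 0.1))
```

Under these assumptions, larger SVs exclude proportionally wider zones, which is why the option is a fraction rather than a fixed distance.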
Outputs in <output>/:
- reads_classification.tsv
- blocks_classification.tsv
- sv_assignment.tsv / sv_assignment_readable.tsv / sv_assignment_readable_long.tsv
- anno_run_manifest.json
svanno: Recompute SV Assignments
Re-runs only the SV assignment step from an existing reads_classification.tsv, useful for tuning thresholds without re-processing the BAM.
sniffcell svanno \
-v sample.vcf.gz \
-i anno_out/reads_classification.tsv \
-w 10000 \
--breakpoint_exclusion_frac 0.1 \
--evidence_mode all_rows \
--min_overlap_pct 0.0 \
--min_agreement_pct 1.0 \
-o anno_out
deconv: Cell-Type Deconvolution
Assigns every read in a BAM a cell-type code using ctDMR methylation patterns, then produces per-read, per-group, and whole-sample summaries.
sniffcell deconv \
-i sample.bam \
-r ref.fa \
-b pbmc_ctdmr.tsv \
-o deconv_out \
-t 8 \
--read_assignment_mode closest_reference_mean
Key options:
- --read_assignment_mode {closest_reference_mean,kmeans}: assignment algorithm (default: closest_reference_mean)
- --split_bam_groups: after deconvolution, split reads into per-group BAMs. Use ; between groups and , between labels within a group. Named splits use =. Example: lymph=t_cell,b_cell,nk_cell;myeloid=monocyte
- --resume: skip ctDMR classification and reload existing TSVs; useful for re-splitting without reprocessing
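To make the --split_bam_groups syntax concrete, here is a minimal parser sketch for the format described above. parse_split_groups is a hypothetical helper, not part of SniffCell, and naming an unnamed group by joining its labels with + is an assumption made only for this example.

```python
def parse_split_groups(spec: str) -> dict[str, list[str]]:
    """Parse 'name=a,b;name2=c' into {name: [labels]}.

    ';' separates groups, ',' separates labels, '=' attaches a name.
    """
    groups: dict[str, list[str]] = {}
    for chunk in spec.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate a trailing ';'
        name, sep, labels_text = chunk.partition("=")
        if not sep:
            # No '=' present: the whole chunk is the label list.
            name, labels_text = "", chunk
        labels = [label.strip() for label in labels_text.split(",") if label.strip()]
        if not labels:
            raise ValueError(f"group {chunk!r} has no labels")
        # Assumption: unnamed groups are named after their joined labels.
        name = name.strip() or "+".join(labels)
        if name in groups:
            raise ValueError(f"duplicate group name {name!r}")
        groups[name] = labels
    return groups


print(parse_split_groups("lymph=t_cell,b_cell,nk_cell;myeloid=monocyte"))
```

On the shell, a value containing ; needs quoting so the shell does not treat it as a command separator.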
Outputs in <output>/:
- deconv_reads_classification.tsv: one row per (read × ctDMR)
- deconv_blocks_classification.tsv: per-ctDMR block summary
- deconv_read_summary.tsv: one row per read with majority cell type and linked celltypes
- deconv_summary.tsv: whole-sample summary in all_rows and per_read modes
- deconv_reads_by_group/: per-group read tables (split by best_group)
- deconv_requested_group_splits/: user-defined BAM and TSV splits (when --split_bam_groups is used)
- deconv_run_manifest.json
viz: Visualize One SV
Renders a figure (PNG or PDF) for a single SV with read-level methylation and ctDMR context.
# Minimal — loads inputs from anno manifest
sniffcell viz \
--anno_output anno_out \
-s sniffles.SV123
# Manual mode
sniffcell viz \
-i sample.bam \
-v sample.vcf.gz \
-s sniffles.SV123 \
-r ref.fa \
-b pbmc_ctdmr.tsv \
-a anno_out/reads_classification.tsv \
-o figures/sniffles.SV123 \
-f png
Notable options:
- --indel_min_bp: overlay read-level indels ≥ N bp on reads (default: 40; set to 0 to disable)
- --linked_ctdmr_mode {distal,extend,strict}: controls how off-window winning ctDMRs are displayed (default: distal)
- --export_tables: also write .summary.tsv, .supporting_reads_assignment.tsv, and .supporting_reads_ctdmr_methylation.tsv
igvviz: IGV Screenshots for One SV
Runs IGV batch mode and produces snapshots per BAM, with reads tagged and grouped by phase.
sniffcell igvviz \
-i fans_a.bam fans_b.bam fans_c.bam \
-v sample.vcf.gz \
-s sniffles.SV123 \
-r ref.fa \
-b pbmc_ctdmr.tsv \
-w 10000 \
-o out/igvviz
Notable options:
- --anno_output: load inputs from anno manifest (manifest-driven mode)
- --igv_cmd: path to IGV executable (default: igv.sh)
- --snapshot_width / --snapshot_height: snapshot dimensions (default: 3600×1600)
- --batch_only: write batch script only, don't run IGV
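For readers unfamiliar with IGV batch mode, the sketch below shows roughly what a per-BAM snapshot script can look like. It uses only standard IGV batch commands (new, genome, load, snapshotDirectory, goto, snapshot, exit); the script that igvviz actually writes with --batch_only may differ, and build_igv_batch, its parameters, and the example coordinates are hypothetical.

```python
from pathlib import Path


def build_igv_batch(reference: str, bams: list[str], chrom: str, pos: int,
                    window: int, snapshot_dir: str) -> str:
    """Build an IGV batch script that takes one snapshot per BAM.

    The region is pos ± window, clamped at 1 because IGV loci are 1-based.
    """
    start = max(1, pos - window)
    end = pos + window
    lines = [f"genome {reference}", f"snapshotDirectory {snapshot_dir}"]
    for bam in bams:
        lines += [
            "new",                          # start a clean session for this BAM
            f"load {bam}",
            f"goto {chrom}:{start}-{end}",
            f"snapshot {Path(bam).stem}.png",
        ]
    lines.append("exit")
    return "\n".join(lines) + "\n"


script = build_igv_batch("ref.fa", ["fans_a.bam", "fans_b.bam"],
                         "chr1", 1_000_000, 10_000, "out/igvviz")
print(script)
```

A script like this can be inspected or edited by hand before passing it to IGV, which is the main use of a batch-only mode.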
report: HTML Review Report
Filters high-confidence SVs from anno output and builds an interactive HTML report.
# Basic report
sniffcell report \
--anno_output anno_out \
--min_overlap_pct 0.8 \
--min_majority_pct 1.0
# With viz figures and IGV screenshots
sniffcell report \
--anno_output anno_out \
--with_figures \
--with_igvviz \
--igv_bams fans1.bam fans2.bam fans3.bam \
--figure_threads 4
# With igv-reports alternate page (requires: pip install igv-reports)
sniffcell report \
--anno_output anno_out \
--with_igvreport \
--igv_bams fans1.bam fans2.bam
Default SV filters:
- assigned_code must be non-empty
- linked_celltypes must be non-empty
- has_hard_conflict must be False
- --min_overlap_pct ≥ 0.8 and --min_majority_pct ≥ 1.0
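The default filters listed above can be written as one predicate over an SV row. This is a sketch under assumptions: assigned_code, linked_celltypes, and has_hard_conflict come from the text, while the overlap_pct and majority_pct field names are hypothetical stand-ins for whatever columns the thresholds are compared against.

```python
def passes_default_filters(row: dict, min_overlap_pct: float = 0.8,
                           min_majority_pct: float = 1.0) -> bool:
    """Return True if an SV row satisfies the default report filters."""
    if not row.get("assigned_code"):
        return False
    if not row.get("linked_celltypes"):
        return False
    # TSV readers often yield strings, so accept both True and "True".
    if str(row.get("has_hard_conflict", False)).strip().lower() == "true":
        return False
    # 'overlap_pct' and 'majority_pct' are assumed field names.
    return (float(row["overlap_pct"]) >= min_overlap_pct
            and float(row["majority_pct"]) >= min_majority_pct)


example = {
    "assigned_code": "t_cell",
    "linked_celltypes": "t_cell",
    "has_hard_conflict": "False",
    "overlap_pct": "0.92",
    "majority_pct": "1.0",
}
print(passes_default_filters(example))
```

Lowering either threshold on the command line corresponds to passing a smaller value for the matching keyword argument here.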
Outputs under <anno_output>/report/:
- index.html: interactive report with genome-wide plots and per-SV panels
- high_confidence_sv.tsv
- figures/: viz panels (when --with_figures)
- igvviz/: IGV screenshots (when --with_igvviz)
- igvreport/index.html: alternate IGV.js page (when --with_igvreport)
- report_manifest.json

Review labels (Real / Not real / Undecided) auto-save to browser localStorage and persist across sessions.
dmsv: Differential Methylation Around SVs
Tests for methylation differences between SV-supporting and non-supporting reads near each SV.
sniffcell dmsv \
-i sample.bam \
-v sample.vcf.gz \
-r ref.fa \
-o dmsv_out \
-m 3 \
-f 1000 \
-c 5 \
-t 8
Outputs:
- dmsv_out/significant_SVs.tsv
- dmsv_out/sv_details/<sv_id>.tsv.gz
Wiki
- End-to-end workflow: wiki/End-to-End-Workflow.md
- Test examples: wiki/Test-Examples.md