Plot Oxford Nanopore variation as self-contained HTML reports.
A Python plotting tool for Oxford Nanopore variation data. One VCF in, one self-contained HTML report out. molamola inspects the VCF header and picks the right plot type automatically — no flags or subcommands to remember:
- SV / cytogenetics report for long-read SV VCFs (Sniffles2 / cuteSV / SVIM / pbsv / NanoVar). Cytoband-ideogram circos plot plus a linear genome SV map with per-type density tracks (INS / DEL / DUP / INV) and BND arcs.
- Per-gene phased-haplotype panels for phased + VEP-annotated small-variant VCFs (WhatsHap / HiPhase). One panel per candidate gene: canonical-transcript exon track, H1 / H2 hap lines, mint phase blocks across both haps, ClinVar-coloured missense lollipops and synonymous-variant ticks for context.
Both produce one self-contained HTML report — figures embedded as base64 PNGs, no external assets, opens offline.
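As a rough illustration of what "self-contained" means here, the sketch below inlines PNG bytes into an HTML string as base64 data URIs. The function names `png_to_img_tag` and `build_report` are hypothetical and are not taken from molamola's code; this only shows the general technique, not the actual report layout.

```python
import base64
import html


def png_to_img_tag(png_bytes: bytes, alt: str = "") -> str:
    """Return an <img> tag whose src is a base64 data URI (no external file)."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f'<img alt="{html.escape(alt, quote=True)}" src="data:image/png;base64,{encoded}">'


def build_report(title: str, figures: list[tuple[str, bytes]]) -> str:
    """Assemble one HTML string from (caption, png_bytes) pairs.

    Hypothetical helper: the real report structure is not described here.
    """
    body = []
    for caption, png_bytes in figures:
        body.append(
            f"<figure>{png_to_img_tag(png_bytes, alt=caption)}"
            f"<figcaption>{html.escape(caption)}</figcaption></figure>"
        )
    return (
        '<!DOCTYPE html><html><head><meta charset="utf-8">'
        f"<title>{html.escape(title)}</title></head>"
        f"<body>{''.join(body)}</body></html>"
    )
```

Because every figure travels inside the HTML text itself, the resulting file can be copied or emailed on its own and still renders without network access.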
Install
pip install molamola
Or for development from a clone:
git clone https://github.com/martinandclaude/molamola.git
cd molamola
pip install -e ".[dev]"
pytest -v
Quick start
# Long-read SV VCF (Sniffles2 etc.) → cytogenetics report
molamola --vcf sample.sniffles.vcf
open path/to/sample.report.html
# Phased + VEP-annotated VCF → compound-het workup, all candidate genes
molamola --vcf sample.phased.vep.vcf.gz
open path/to/sample.compound_het.report.html
# Just one gene from a phased + VEP VCF
molamola --vcf sample.phased.vep.vcf.gz --gene NEB
The plot type is auto-detected from the VCF header: ##INFO=<ID=SVTYPE> selects SV mode; ##INFO=<ID=CSQ> + ##FORMAT=<ID=PS> selects compound-het mode. VCFs that match neither shape are refused with a clear error.
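The header check described above can be sketched in a few lines of Python. This is a minimal sketch, not molamola's implementation: the function names and the returned mode strings are hypothetical, and checking SVTYPE before CSQ + PS when a header declares both is an assumption.

```python
import gzip
import re


def read_header_lines(path):
    """Yield the '##' meta lines of a plain or gzip/bgzip-compressed VCF."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("##"):
                break  # '#CHROM' line or first record: meta lines are over
            yield line.rstrip("\n")


def detect_mode(header_lines):
    """Return 'sv' or 'compound_het' (hypothetical labels), else raise ValueError."""
    info_ids, format_ids = set(), set()
    for line in header_lines:
        match = re.match(r"##(INFO|FORMAT)=<ID=([^,>]+)", line)
        if match:
            target = info_ids if match.group(1) == "INFO" else format_ids
            target.add(match.group(2))
    if "SVTYPE" in info_ids:  # assumed to take precedence if both shapes match
        return "sv"
    if "CSQ" in info_ids and "PS" in format_ids:
        return "compound_het"
    raise ValueError(
        "VCF header declares neither INFO/SVTYPE nor INFO/CSQ + FORMAT/PS"
    )
```

For example, `detect_mode(read_header_lines("sample.sniffles.vcf"))` would return `"sv"` for a header that declares `##INFO=<ID=SVTYPE,...>`.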
What it produces
SV / cytogenetics report
- Circos plot (pyCirclize): cytoband ideogram with BND ribbons; line thickness scaled by SUPPORT, colour by VAF.
- Linear genome SV map: chr1 → chrY, one row each. Greyscale ISCN-style cytobands, four per-type density strips (INS = blue, DEL = red, DUP = green, INV = purple) at 1 Mb bins, BND arcs above. Annotated with ISCN nomenclature like t(7;17)(q11.23;q12).
- Two noise heuristics that work on the VCF alone, with no external reference data needed: acrocentric short-arm BNDs (chr13/14/15/21/22 p-arms; on by default for hg38, off for T2T) and coverage-anomaly BNDs (max(COVERAGE) >= --cov-ratio × baseline AND VAF < --cov-vaf-max).
- Supports hg38 and T2T-CHM13v2.0 via bundled cytobands.
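The two noise heuristics can be written as small predicates. The sketch below is illustrative only: the function names are hypothetical, how molamola derives the coverage baseline is not stated here (so it is passed in), and the threshold values in the usage note are made-up numbers, not the tool's defaults.

```python
# Acrocentric chromosomes named in the heuristic above.
ACROCENTRIC = {"chr13", "chr14", "chr15", "chr21", "chr22"}


def is_acrocentric_p_arm(chrom: str, band: str) -> bool:
    """True when a breakend sits on the short arm of an acrocentric chromosome.

    `band` is a cytoband name such as 'p11.2' (assumed to come from the
    bundled cytoband table).
    """
    return chrom in ACROCENTRIC and band.startswith("p")


def is_coverage_anomaly(coverage, vaf, baseline, cov_ratio, cov_vaf_max) -> bool:
    """Apply max(COVERAGE) >= cov_ratio * baseline AND VAF < cov_vaf_max.

    `coverage` is the list of values from the record's COVERAGE field;
    `baseline` is a genome-wide reference depth supplied by the caller
    (assumption: its exact definition is not given in this document).
    """
    return max(coverage) >= cov_ratio * baseline and vaf < cov_vaf_max
```

With illustrative inputs, `is_coverage_anomaly([35, 120, 40], vaf=0.05, baseline=30, cov_ratio=3.0, cov_vaf_max=0.2)` evaluates to `True`: the peak coverage of 120 exceeds 3.0 × 30 = 90 and the VAF is below 0.2. Both conditions must hold, so a high-coverage breakend with a high VAF is not flagged.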
Compound-het panels
- One panel per gene: IGV-style blue canonical-transcript exon track on top, two horizontal H1 / H2 hap lines, mint phase-block rectangles spanning both haps (with off-edge arrows when a block stretches past the gene window), ClinVar-coloured missense lollipops hanging downward, synonymous-variant x markers on the hap line for context.
- Auto-select sweep when --gene is omitted: a gene qualifies iff at least one trans pair has one variant in ClinVar P/LP or VUS and the partner is not benign. The report splits results into a strict section (both variants P/LP or VUS, i.e. true compound-het) and an extended section (anchor P/LP-or-VUS, partner conflicting / no-ClinVar / P/LP / VUS). The strict heading is shown even when its subset is empty so the dichotomy is always visible.
- Use --gene SYMBOL to plot a specific gene regardless of the auto-select rule (useful for manual review of P/LP + benign or no-ClinVar + no-ClinVar pairs).
- Tunable via --min-pair-count (raise for stricter sweeps) and --max-genes (default 50).
- hg38-only: ClinVar coordinates are hg38, and coordinate-based lookup is the matching path.
ClinVar usage in compound-het is purely a colour key on data points — no clinical interpretation is performed or implied.
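The auto-select rule described above can be sketched as follows. Everything named here is an assumption made for illustration: the bucket strings ("P/LP", "VUS", "conflicting", "benign", and `None` for no ClinVar entry), the dict layout of a variant, treating two variants as trans when they share a phase set (PS) but sit on different haplotypes, and labelling a gene "strict" as soon as any qualifying pair is strict. The buckets are used purely as labels, in keeping with the note above.

```python
from itertools import combinations

ANCHOR_BUCKETS = {"P/LP", "VUS"}  # hypothetical bucket names


def is_trans(a: dict, b: dict) -> bool:
    """Same phase set, opposite haplotypes (assumed definition of a trans pair)."""
    return a["ps"] == b["ps"] and a["hap"] != b["hap"]


def pair_qualifies(a: dict, b: dict) -> bool:
    """One variant is P/LP or VUS and its partner is not benign."""
    return any(
        anchor["clinvar"] in ANCHOR_BUCKETS and partner["clinvar"] != "benign"
        for anchor, partner in ((a, b), (b, a))
    )


def pair_is_strict(a: dict, b: dict) -> bool:
    """Both variants are P/LP or VUS."""
    return a["clinvar"] in ANCHOR_BUCKETS and b["clinvar"] in ANCHOR_BUCKETS


def classify_gene(variants: list[dict]):
    """Return 'strict', 'extended', or None when the gene does not qualify."""
    pairs = [
        (a, b)
        for a, b in combinations(variants, 2)
        if is_trans(a, b) and pair_qualifies(a, b)
    ]
    if not pairs:
        return None
    return "strict" if any(pair_is_strict(a, b) for a, b in pairs) else "extended"
```

Under these assumptions, a gene with a P/LP variant on H1 and a no-ClinVar variant on H2 in the same phase set comes out as `"extended"`, while a P/LP + benign pair does not qualify and would only be plotted via --gene SYMBOL.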
Bundled references
All in molamola/data/:
- cytoBand.txt.gz (hg38), cytoBand.t2t.txt.gz (T2T-CHM13v2.0): UCSC cytoband annotations for SV mode.
- canonical_exons.hg38.tsv.gz: MANE Select v1.x canonical transcripts and exon coordinates.
- clinvar.hg38.tsv.xz: molamola's reduced ClinVar TSV (chrom, pos, ref, alt, significance bucket; xz-compressed). Release date logged in each report's run-metadata.
Bundled-only by design: molamola does not download reference data or query online resources at run time. Use --clinvar PATH or --canonical-exons PATH to override the bundled files.
The reduced TSVs can be regenerated reproducibly from public sources via scripts/derive_canonical_exons.py and scripts/derive_clinvar_for_molamola.py.
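To give a feel for the coordinate-based lookup that the reduced ClinVar TSV supports, here is a minimal sketch of a loader for an xz-compressed file with the five columns listed above. The exact on-disk layout is an assumption (tab-separated, columns in the order chrom, pos, ref, alt, bucket, with any header or comment row skipped), and `load_clinvar` is a hypothetical name, not molamola's API.

```python
import csv
import lzma


def load_clinvar(path):
    """Map (chrom, pos, ref, alt) -> significance bucket from an .xz TSV.

    Assumes five tab-separated columns in the order chrom, pos, ref, alt,
    bucket; rows whose second column is not an integer (for example a
    header line) are skipped.
    """
    lookup = {}
    with lzma.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 5 or not row[1].isdigit():
                continue
            chrom, pos, ref, alt, bucket = row[:5]
            lookup[(chrom, int(pos), ref, alt)] = bucket
    return lookup
```

A variant is then matched with `lookup.get((chrom, pos, ref, alt))`, which returns `None` when ClinVar has no entry at those coordinates. A custom file passed via --clinvar PATH would need to follow whatever layout molamola actually expects, which may differ from this sketch.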
CLI
molamola --vcf VCF [--out DIR] [--reference hg38|t2t] [...]
Full flag list: docs/CLI.md. Filter explanations: docs/FILTERS.md. Output formats: docs/OUTPUTS.md. Worked examples: docs/EXAMPLES.md. Per-release changes: CHANGELOG.md.
Development
pytest -v
ruff check .
CI runs lint + pytest on Python 3.10 / 3.11 / 3.12 on every push and PR.
Acknowledgements
- Sniffles2, cuteSV, SVIM, pbsv, NanoVar — long-read SV callers.
- WhatsHap, HiPhase — long-read phasing.
- VEP, MANE Select, ClinVar — variant annotation and significance.
- pyCirclize — circos plot.
- matplotlib, numpy.
- bcftools / samtools / htslib — VCF pre-processing helpers.
- UCSC Genome Browser — hg38 and T2T-CHM13v2.0 cytobands.
- iconsdb.com — header fish icon (deep-pink, mirrored).
License
MIT.