Plot Oxford Nanopore variation as self-contained HTML reports.

                 _                       _              .--.
 _ __ ___   ___ | | __ _ _ __ ___   ___ | | __ _      _/    \___
| '_ ` _ \ / _ \| |/ _` | '_ ` _ \ / _ \| |/ _` |    ( o        )
| | | | | | (_) | | (_| | | | | | | (_) | | (_| |     \___..___/
|_| |_| |_|\___/|_|\__,_|_| |_| |_|\___/|_|\__,_|         ||

A Python plotting tool for Oxford Nanopore variation data. One VCF in, one self-contained HTML report out. molamola inspects the VCF header and picks the right plot type automatically, so there are no mode flags or subcommands to remember:

  • SV / cytogenetics report for long-read SV VCFs (Sniffles2 / cuteSV / SVIM / pbsv / NanoVar). Cytoband-ideogram circos plot plus a linear genome SV map with per-type density tracks (INS / DEL / DUP / INV) and BND arcs.
  • Per-gene phased-haplotype panels for phased + VEP-annotated small-variant VCFs (WhatsHap / HiPhase). One panel per candidate gene: canonical-transcript exon track, H1 / H2 hap lines, mint phase blocks across both haps, ClinVar-coloured missense lollipops and synonymous-variant ticks for context.

Both produce one self-contained HTML report — figures embedded as base64 PNGs, no external assets, opens offline.
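
As a rough illustration of what "self-contained" means here, the sketch below inlines PNG bytes into an HTML page as base64 data URIs using only the standard library. The function names (png_to_img_tag, build_report) are hypothetical and are not taken from molamola's code; the real report layout is likely richer.

```python
import base64
import html


def png_to_img_tag(png_bytes: bytes, alt: str) -> str:
    """Return an <img> tag whose src is a base64 data URI holding the PNG."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f'<img alt="{html.escape(alt)}" src="data:image/png;base64,{encoded}">'


def build_report(title: str, figures: list[tuple[str, bytes]]) -> str:
    """Assemble one HTML document with every (name, png_bytes) figure inlined."""
    tags = "\n".join(png_to_img_tag(data, alt=name) for name, data in figures)
    safe_title = html.escape(title)
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{safe_title}</title></head>\n'
        f"<body>\n<h1>{safe_title}</h1>\n{tags}\n</body></html>\n"
    )
```

Because the image bytes travel inside the HTML itself, the resulting file can be moved or emailed on its own and still renders without network access.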

Install

pip install molamola

Or for development from a clone:

git clone https://github.com/martinandclaude/molamola.git
cd molamola
pip install -e ".[dev]"
pytest -v

Quick start

# Long-read SV VCF (Sniffles2 etc.) → cytogenetics report
molamola --vcf sample.sniffles.vcf
open path/to/sample.report.html

# Phased + VEP-annotated VCF → compound-het workup, all candidate genes
molamola --vcf sample.phased.vep.vcf.gz
open path/to/sample.compound_het.report.html

# Just one gene from a phased + VEP VCF
molamola --vcf sample.phased.vep.vcf.gz --gene NEB

The plot type is auto-detected from the VCF header: ##INFO=<ID=SVTYPE> selects SV mode; ##INFO=<ID=CSQ> + ##FORMAT=<ID=PS> selects compound-het mode. VCFs that match neither shape are refused with a clear error.
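
The detection rule above can be sketched in a few lines of Python. This is an illustrative reading of the rule, not molamola's actual implementation: the names read_header_lines and detect_mode are hypothetical, and checking SVTYPE before CSQ + PS is an assumption about precedence that this README does not state.

```python
import gzip
import re

# Matches the ID of an INFO or FORMAT meta-information line in a VCF header.
HEADER_ID = re.compile(r"##(INFO|FORMAT)=<ID=([^,>]+)")


def read_header_lines(path):
    """Yield the '##' meta-information lines of a plain or gzip/bgzip VCF."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("##"):
                break  # reached the #CHROM line or the first record
            yield line.rstrip("\n")


def detect_mode(header_lines):
    """Return 'sv' or 'compound_het'; raise ValueError for any other shape."""
    ids = {"INFO": set(), "FORMAT": set()}
    for line in header_lines:
        match = HEADER_ID.match(line)
        if match:
            ids[match.group(1)].add(match.group(2))
    if "SVTYPE" in ids["INFO"]:  # assumed to take precedence
        return "sv"
    if "CSQ" in ids["INFO"] and "PS" in ids["FORMAT"]:
        return "compound_het"
    raise ValueError("VCF header matches neither SV nor compound-het mode")
```

Usage would look like detect_mode(read_header_lines("sample.sniffles.vcf")).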

What it produces

SV / cytogenetics report

  • Circos plot (pyCirclize) — cytoband ideogram with BND ribbons; line thickness scaled by SUPPORT, colour by VAF.
  • Linear genome SV map — chr1 → chrY, one row each. Greyscale ISCN-style cytobands, four per-type density strips (INS = blue, DEL = red, DUP = green, INV = purple) at 1 Mb bins, BND arcs above. Annotated with ISCN nomenclature like t(7;17)(q11.23;q12).
  • Two noise heuristics that work on the VCF alone — no external reference data needed: acrocentric short-arm BNDs (chr13/14/15/21/22 p-arms; on by default for hg38, off for T2T) and coverage-anomaly BNDs (max(COVERAGE) >= --cov-ratio × baseline AND VAF < --cov-vaf-max).
  • Supports hg38 and T2T-CHM13v2.0 via bundled cytobands.
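
The coverage-anomaly condition in the list above reduces to a small predicate. The sketch below is an assumption-laden illustration: is_coverage_anomaly is a hypothetical name, the baseline is taken as an input because this README does not say how it is computed, and no default thresholds are implied.

```python
from typing import Optional, Sequence


def is_coverage_anomaly(
    coverage: Sequence[Optional[float]],
    vaf: float,
    baseline: float,
    cov_ratio: float,
    cov_vaf_max: float,
) -> bool:
    """True when max(COVERAGE) >= cov_ratio * baseline AND VAF < cov_vaf_max.

    `coverage` holds the values of the record's COVERAGE field; entries
    that could not be parsed are expected as None and are ignored.
    """
    observed = [value for value in coverage if value is not None]
    if not observed or baseline <= 0:
        return False  # nothing to compare against
    return max(observed) >= cov_ratio * baseline and vaf < cov_vaf_max
```

With made-up numbers, a BND whose flanking coverage peaks at 120x against a 30x baseline and whose VAF is 0.05 would be flagged when cov_ratio is 3.0 and cov_vaf_max is 0.2; the same record with a VAF of 0.5 would not.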

Compound-het panels

  • One panel per gene: IGV-style blue canonical-transcript exon track on top, two horizontal H1 / H2 hap lines, mint phase-block rectangles spanning both haps (with off-edge arrows when a block stretches past the gene window), ClinVar-coloured missense lollipops hanging downward, synonymous-variant x markers on the hap line for context.
  • Auto-select sweep when --gene is omitted: a gene qualifies if and only if at least one trans pair has one variant in ClinVar P/LP or VUS and a partner that is not benign. The report splits results into a strict section (both variants P/LP or VUS, i.e. true compound-het) and an extended section (anchor P/LP-or-VUS, partner conflicting / no-ClinVar / P/LP / VUS). The strict heading is shown even when its subset is empty so the dichotomy is always visible.
  • Use --gene SYMBOL to plot a specific gene regardless of the auto-select rule (useful for manual review of P/LP + benign or no-ClinVar + no-ClinVar pairs).
  • Tunable via --min-pair-count (raise for stricter sweeps) and --max-genes (default 50).
  • hg38 only: the bundled ClinVar table uses hg38 coordinates, and variants are matched against it by coordinate.
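
One way to read the auto-select rule is as a classifier over the two ClinVar buckets of a trans pair. The sketch below is illustrative only: the bucket labels ("P/LP", "VUS", "conflicting", "benign", and None for no ClinVar entry) and the function names are hypothetical, and assigning each pair to its strictest tier is an assumption, since a strict pair also satisfies the extended rule.

```python
from typing import Iterable, Optional, Tuple

ANCHOR_BUCKETS = {"P/LP", "VUS"}  # hypothetical labels for the anchor tier


def classify_pair(a: Optional[str], b: Optional[str]) -> Optional[str]:
    """Return 'strict', 'extended', or None for one trans pair of buckets."""
    a_anchor = a in ANCHOR_BUCKETS
    b_anchor = b in ANCHOR_BUCKETS
    if a_anchor and b_anchor:
        return "strict"  # both variants P/LP or VUS
    if (a_anchor and b != "benign") or (b_anchor and a != "benign"):
        return "extended"  # one anchor, partner conflicting or absent from ClinVar
    return None  # no anchor, or the partner is benign


def gene_qualifies(pairs: Iterable[Tuple[Optional[str], Optional[str]]]) -> bool:
    """A gene is auto-selected when at least one trans pair classifies."""
    return any(classify_pair(a, b) is not None for a, b in pairs)
```

Under this reading, a P/LP + benign pair and a no-ClinVar + no-ClinVar pair both return None, which matches the cases the README says require --gene for manual review.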

ClinVar usage in compound-het is purely a colour key on data points — no clinical interpretation is performed or implied.

Bundled references

All in molamola/data/:

  • cytoBand.txt.gz (hg38), cytoBand.t2t.txt.gz (T2T-CHM13v2.0) — UCSC cytoband annotations for SV mode.
  • canonical_exons.hg38.tsv.gz — MANE Select v1.x canonical transcripts and exon coordinates.
  • clinvar.hg38.tsv.xz — molamola's reduced ClinVar TSV (chrom, pos, ref, alt, significance bucket; xz-compressed). Release date logged in each report's run-metadata.

Bundled-only by design: molamola does not download reference data or query online services. Use --clinvar PATH or --canonical-exons PATH to override the bundled files.

The reduced TSVs can be regenerated reproducibly from public sources via scripts/derive_canonical_exons.py and scripts/derive_clinvar_for_molamola.py.
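
For anyone preparing a replacement file for --clinvar, the sketch below shows how an xz-compressed TSV with the five columns listed above could be loaded into a coordinate-keyed lookup. It assumes the columns appear in that order and that any header or comment row has a non-numeric second field; neither detail is confirmed by this README, and load_clinvar_buckets is a hypothetical name.

```python
import csv
import lzma


def load_clinvar_buckets(path):
    """Map (chrom, pos, ref, alt) to a significance bucket from a .tsv.xz file."""
    lookup = {}
    with lzma.open(path, "rt", encoding="utf-8", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank lines, comments, and a possible header row (assumed layout).
            if len(row) < 5 or not row[1].isdigit():
                continue
            chrom, pos, ref, alt, bucket = row[:5]
            lookup[(chrom, int(pos), ref, alt)] = bucket
    return lookup
```

Keying on the full (chrom, pos, ref, alt) tuple keeps the lookup allele-specific, which is what a coordinate-based match needs.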

CLI

molamola --vcf VCF [--out DIR] [--reference hg38|t2t] [...]

Full flag list: docs/CLI.md. Filter explanations: docs/FILTERS.md. Output formats: docs/OUTPUTS.md. Worked examples: docs/EXAMPLES.md. Per-release changes: CHANGELOG.md.

Development

pytest -v
ruff check .

CI runs lint + pytest on Python 3.10 / 3.11 / 3.12 on every push and PR.

Acknowledgements

License

MIT.
