Multi-evidence visualisation of structural variants between two haplotype assemblies of the same individual

HapDuo

HapDuo aligns two phased haplotype assemblies, detects inversions from the canonical +++ −−− +++ strand-switch pattern in a chained PAF, and renders a stack of supporting figures. The centrepiece is a per-inversion multi-evidence panel that combines Hi-C contact pyramids, CCS read piles, PAF synteny ribbons, a chromosome-scale cartoon, and close-up zoom panels around all four breakpoints (two per haplotype) on a single page.
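To make the strand-switch idea concrete, here is a minimal Python sketch of how a +++ −−− +++ pattern could be picked out of chained alignment blocks. It is an illustration, not HapDuo's actual implementation: the function name find_inversions, the (start, end, strand) tuple layout, and the reading of min_anchor (minimum length of each flanking + run) and min_inv (minimum length of the − run) are all assumptions made for this example.

```python
from itertools import groupby

def find_inversions(blocks, min_anchor=5000, min_inv=50000):
    """blocks: (start, end, strand) tuples on one target chromosome, sorted by start.

    Hypothetical helper: returns (start, end) of every '-' run that is
    flanked on both sides by a '+' run.
    """
    # Collapse consecutive same-strand blocks into runs: (strand, start, end).
    runs = []
    for strand, group in groupby(blocks, key=lambda b: b[2]):
        group = list(group)
        runs.append((strand, group[0][0], group[-1][1]))

    calls = []
    # Slide a three-run window and look for the + / - / + pattern.
    for left, mid, right in zip(runs, runs[1:], runs[2:]):
        if (left[0], mid[0], right[0]) != ("+", "-", "+"):
            continue
        if mid[2] - mid[1] < min_inv:
            continue  # inverted run too short (assumed meaning of min_inv)
        if min(left[2] - left[1], right[2] - right[1]) < min_anchor:
            continue  # a flanking anchor is too short (assumed meaning of min_anchor)
        calls.append((mid[1], mid[2]))
    return calls

# Made-up blocks: two '+' blocks, two '-' blocks, one '+' block.
blocks = [
    (0, 100_000, "+"),
    (100_000, 180_000, "+"),
    (200_000, 320_000, "-"),
    (330_000, 400_000, "-"),
    (410_000, 600_000, "+"),
]
print(find_inversions(blocks))  # [(200000, 400000)]
```

Collapsing blocks into same-strand runs first keeps the pattern test to a single three-run window, at the cost of ignoring gaps between blocks; a real caller would probably also want to check query-side coordinates.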

Install

pip install hapduo

Requires Python ≥ 3.9, numpy, matplotlib, pysam, and hic-straw. The Snakemake pipeline additionally needs minimap2 and snakemake on PATH.

What you get

After install you have five command-line tools:

Command          Output
hapduo-detect    Inversion breakpoint TSV from a chained PAF
hapduo-style4    One multi-evidence per-inversion panel (180 × 228 mm, ≥ 6 pt text, TrueType, Illustrator-editable)
hapduo-batch     All hapduo-style4 panels for one breakpoints TSV
hapduo-ideogram  Single-haplotype ideogram of inversion calls
hapduo-synteny   Whole-genome two-haplotype synteny ribbon ideogram

…plus a top-level Snakefile that runs the whole pipeline from a single config file.

Quickstart (manual)

# 1. Align hapB → hapA at two sensitivity levels
minimap2 -cx asm5  --eqx -t 16 hapA.fasta hapB.fasta > hapB_to_hapA.asm5.paf
minimap2 -cx asm20 --eqx -t 16 hapA.fasta hapB.fasta > hapB_to_hapA.asm20.paf

# 2. Build a FASTA index of a concatenated hapA + hapB reference (for chr offsets)
cat hapA.fasta hapB.fasta > both.fasta && samtools faidx both.fasta

# 3. Call inversions
hapduo-detect hapB_to_hapA.asm5.paf --min-anchor 5000 --min-inv 50000 > breakpoints.tsv

# 4. Render the whole-genome synteny ideogram (asm20 gives fuller coverage)
hapduo-synteny --paf hapB_to_hapA.asm20.paf --fai both.fasta.fai \
               --chain --min-block 5000 --out-prefix figures/synteny

# 5. Render one multi-evidence panel per inversion
hapduo-batch --tsv breakpoints.tsv \
             --paf hapB_to_hapA.asm5.paf --cartoon-paf hapB_to_hapA.asm20.paf \
             --ccs ccs.sorted.bam --hic contacts.hic --hic-chrom assembly \
             --fai both.fasta.fai --outdir figures/
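Step 2 builds the .fai "for chr offsets". As a rough sketch of what that could mean, the snippet below derives cumulative offsets from a .fai index, whose first two tab-separated columns are sequence name and sequence length. The function name cumulative_offsets and the example sequence names are hypothetical, and whether HapDuo computes offsets this way internally is an assumption.

```python
import io

def cumulative_offsets(fai_lines):
    """Map each sequence name to (offset, length), where offset is the summed
    length of all sequences listed before it in the .fai."""
    offsets = {}
    running = 0
    for line in fai_lines:
        if not line.strip():
            continue  # skip blank lines
        name, length = line.split("\t")[:2]
        offsets[name] = (running, int(length))
        running += int(length)
    return offsets

# Tiny in-memory stand-in for both.fasta.fai (made-up names and lengths).
example_fai = io.StringIO(
    "hapA_chr1\t1000\t11\t60\t61\n"
    "hapA_chr2\t500\t1040\t60\t61\n"
    "hapB_chr1\t990\t1560\t60\t61\n"
)
offsets = cumulative_offsets(example_fai)
print(offsets["hapB_chr1"])  # (1500, 990)
```

With a real index the same function can be fed an open file handle, for example `with open("both.fasta.fai") as fh: offsets = cumulative_offsets(fh)`.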

Quickstart (Snakemake)

git clone https://github.com/conchoecia/HapDuo.git
cd HapDuo
cp config.example.yaml config.yaml          # edit paths
snakemake --cores 16

The pipeline runs steps 1–5 above and writes everything under outdir/.

What does each figure show?

  • Synteny ideogram (hapduo-synteny): one row per chromosome pair, with the haplotype-A bar on top and the haplotype-B bar on the bottom (both drawn at true Mb length) and every chained PAF block overlaid as a polygon ribbon (blue for colinear, red for inverted). This is the quickest way to see where the inversions lie along the genome.

  • Inversion ideogram (hapduo-ideogram): single-haplotype chromosome bars with each inversion call drawn as a coloured rectangle, sized by inversion length. Counts per chromosome are summarised in a side panel.

  • Per-inversion multi-evidence panel (hapduo-style4): one 180 × 228 mm panel for a single inversion of interest, showing:

    1. Hi-C contact pyramid for the haplotype-A window (cells auto-aspected so they render as squares at any figure size).
    2. CCS read-depth track and read pile on haplotype-A, with reads spanning a 1 kb window around any breakpoint highlighted dark.
    3. Haplotype-A chromosome track with the two breakpoints marked and keyed by ① and ②.
    4. The ribbon panel itself with chained PAF polygons.
    5. Haplotype-B chromosome track with markers ③ and ④.
    6. CCS read pile + depth on haplotype-B.
    7. Mirrored Hi-C pyramid for haplotype-B.
    8. Top-right inset: chromosome-scale two-haplotype synteny cartoon with the inversion outlined as a black rectangle on both bars.
    9. Bottom row: four ±20 kb CCS read-pile zooms, one per breakpoint, keyed by the same ①②③④ digits used above.
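The read highlighting and the zoom panels described in the list above both come down to simple interval arithmetic. The sketch below shows one way to express it; it is not taken from HapDuo's source. The helper names are hypothetical, the "1 kb window" is assumed to be centred on the breakpoint (±500 bp), "spanning" is assumed to mean the read covers that whole window, and the coordinates are made up.

```python
def zoom_window(breakpoint, flank=20_000):
    """Return (start, end) of a +/- flank window around a breakpoint, clipped at 0."""
    return max(0, breakpoint - flank), breakpoint + flank

def spans_breakpoint(read_start, read_end, breakpoint, window=1_000):
    """True if the read covers the whole window centred on the breakpoint."""
    half = window // 2
    return read_start <= breakpoint - half and read_end >= breakpoint + half

# Made-up haplotype-A breakpoints, keyed like the panel markers 1 and 2.
breakpoints = {"1": 1_200_000, "2": 1_850_000}
# Made-up read intervals as (start, end) on the same chromosome.
reads = [(1_190_000, 1_205_000), (1_199_800, 1_215_000), (1_840_000, 1_860_000)]

for key, bp in breakpoints.items():
    start, end = zoom_window(bp)
    spanning = [r for r in reads if spans_breakpoint(r[0], r[1], bp)]
    print(key, (start, end), len(spanning))
```

In practice the read intervals would come from the CCS BAM, for instance via pysam's AlignmentFile.fetch over the zoom window, using each read's reference_start and reference_end.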

Citing

A citation will be added here when the accompanying manuscript is public.

Releasing

HapDuo is published to PyPI via OIDC trusted publishing — no API tokens are stored in the repository. To cut a release:

git tag v0.1.0
git push --tags

The Release to PyPI workflow at .github/workflows/release.yml builds an sdist + wheel from the tagged commit, runs a smoke test against the five console scripts, publishes to PyPI, and attaches the dist files to a matching GitHub release.
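For readers who want a similar check locally, a smoke test of the five console scripts might look roughly like the sketch below. It only verifies that each command resolves on PATH; what the real workflow's smoke test does is not described here, so treat the function name missing_scripts and the whole approach as assumptions.

```python
import shutil

# The five console scripts listed under "What you get".
CONSOLE_SCRIPTS = [
    "hapduo-detect",
    "hapduo-style4",
    "hapduo-batch",
    "hapduo-ideogram",
    "hapduo-synteny",
]

def missing_scripts(names, which=shutil.which):
    """Return the names that cannot be resolved on PATH.

    `which` is injectable so the check can be exercised without installing anything.
    """
    return [name for name in names if which(name) is None]

missing = missing_scripts(CONSOLE_SCRIPTS)
print("missing:", ", ".join(missing) if missing else "none")
```

A stricter variant could also run each command and check its exit status, but which flags the tools accept for that purpose would need to be confirmed against the package itself.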

The one-time setup on the PyPI side is documented at the top of that workflow file.

Legacy: DPGB dotplot pipeline

This repository was previously called DPGB (Dot Plot Genome Browser) and shipped a Snakemake pipeline for chained-PAF dot plots of CLR + CCS read mappings. That pipeline is preserved under archive/dpgb-dotplot/ for anyone still running it; it is not maintained.
