HapDuo

Multi-evidence visualisation of structural variants between two haplotype assemblies of the same individual.

HapDuo aligns two phased haplotype assemblies, detects inversions from the canonical +++ −−− +++ strand-switch pattern in a chained PAF, and renders a stack of supporting figures. These include a per-inversion multi-evidence panel that combines Hi-C contact pyramids, CCS read piles, PAF synteny ribbons, a chromosome-scale cartoon, and close-up zoom panels around all four breakpoints on a single page.
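
The strand-switch idea can be illustrated with a minimal sketch. This is not HapDuo's implementation: the function names are hypothetical, the input is assumed to be the PAF records of a single query/target chromosome pair, and the `min_anchor` and `min_inv` parameters only mirror the names of the `--min-anchor` and `--min-inv` flags (their exact semantics in `hapduo-detect` are an assumption here). It sorts blocks by target start and reports every run of `-` blocks that is flanked by `+` blocks on both sides.

```python
from typing import Iterable, List, Tuple

Block = Tuple[int, int, str]  # (target_start, target_end, strand)


def parse_paf_blocks(lines: Iterable[str], min_anchor: int = 0) -> List[Block]:
    """Read PAF lines and keep blocks at least `min_anchor` bp long on the target.

    Standard PAF columns (1-based): 5 = strand, 8 = target start, 9 = target end.
    """
    blocks = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:  # a valid PAF record has 12 mandatory columns
            continue
        strand, tstart, tend = fields[4], int(fields[7]), int(fields[8])
        if tend - tstart >= min_anchor:
            blocks.append((tstart, tend, strand))
    return sorted(blocks)


def find_inversions(blocks: List[Block], min_inv: int = 0) -> List[Tuple[int, int]]:
    """Return target intervals of '-' runs flanked by '+' blocks on both sides."""
    calls = []
    i, n = 0, len(blocks)
    while i < n:
        if blocks[i][2] != "-":
            i += 1
            continue
        j = i  # extend the run of consecutive '-' blocks
        while j + 1 < n and blocks[j + 1][2] == "-":
            j += 1
        flanked = (
            i > 0 and j < n - 1
            and blocks[i - 1][2] == "+" and blocks[j + 1][2] == "+"
        )
        start = blocks[i][0]
        end = max(b[1] for b in blocks[i:j + 1])
        if flanked and end - start >= min_inv:
            calls.append((start, end))
        i = j + 1
    return calls


# Made-up records for one chromosome pair: a + block, a - block, a + block.
demo = [
    "hapB_chr1\t400000\t0\t100000\t+\thapA_chr1\t400000\t0\t100000\t99000\t100000\t60",
    "hapB_chr1\t400000\t100000\t300000\t-\thapA_chr1\t400000\t100000\t300000\t198000\t200000\t60",
    "hapB_chr1\t400000\t300000\t400000\t+\thapA_chr1\t400000\t300000\t400000\t99000\t100000\t60",
]
print(find_inversions(parse_paf_blocks(demo, min_anchor=5000), min_inv=50000))
```

A real caller would group records by query and target name before applying this, and would likely also track the query-side coordinates to obtain all four breakpoints.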

Install

pip install hapduo

Requires Python ≥ 3.9, numpy, matplotlib, pysam, and hic-straw. The Snakemake pipeline additionally needs minimap2 and snakemake on PATH.

What you get

After install you have five command-line tools:

Command            Output
hapduo-detect      Inversion breakpoint TSV from a chained PAF
hapduo-style4      One multi-evidence per-inversion panel (180 × 228 mm, ≥ 6 pt text, TrueType, Illustrator-editable)
hapduo-batch       All hapduo-style4 panels for one breakpoints TSV
hapduo-ideogram    Single-haplotype ideogram of inversion calls
hapduo-synteny     Whole-genome two-haplotype synteny ribbon ideogram

The repository also provides a top-level Snakefile that runs the whole pipeline from a single config file.
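
The panel specification quoted for hapduo-style4 above (180 × 228 mm, text no smaller than 6 pt, TrueType fonts that stay editable in Illustrator) maps onto a few standard matplotlib settings. The sketch below shows one way to reproduce those constraints in your own figures; it is an assumption about how such output can be produced, not a description of HapDuo's internals, and `mm_to_in`, `EDITABLE_TEXT_RC`, and `save_blank_panel` are hypothetical names.

```python
MM_PER_INCH = 25.4


def mm_to_in(mm: float) -> float:
    """Convert millimetres to inches (matplotlib figure sizes are in inches)."""
    return mm / MM_PER_INCH


# 180 x 228 mm expressed in inches for matplotlib's figsize argument.
PANEL_SIZE_IN = (mm_to_in(180), mm_to_in(228))

# fonttype 42 embeds TrueType fonts in PDF/PS output, so text remains text
# (not outlines) when the file is opened in a vector editor.
EDITABLE_TEXT_RC = {
    "pdf.fonttype": 42,
    "ps.fonttype": 42,
    "svg.fonttype": "none",  # keep SVG text as <text> elements
    "font.size": 6,          # default text size; nothing should go below this
}


def save_blank_panel(path: str) -> None:
    """Write an empty panel of the right physical size with editable text."""
    import matplotlib
    matplotlib.use("Agg")  # headless backend, no display needed
    import matplotlib.pyplot as plt

    matplotlib.rcParams.update(EDITABLE_TEXT_RC)
    fig = plt.figure(figsize=PANEL_SIZE_IN)
    fig.text(0.5, 0.5, "placeholder", ha="center", fontsize=6)
    fig.savefig(path)
    plt.close(fig)
```

Calling `save_blank_panel("panel.pdf")` should give a page of the stated dimensions whose text can be selected and edited in a vector editor.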

Quickstart (manual)

# 1. Align hapB → hapA at two sensitivity levels
minimap2 -cx asm5  --eqx -t 16 hapA.fasta hapB.fasta > hapB_to_hapA.asm5.paf
minimap2 -cx asm20 --eqx -t 16 hapA.fasta hapB.fasta > hapB_to_hapA.asm20.paf

# 2. Build a FASTA index of a concatenated hapA + hapB reference (for chr offsets)
cat hapA.fasta hapB.fasta > both.fasta && samtools faidx both.fasta

# 3. Call inversions
hapduo-detect hapB_to_hapA.asm5.paf --min-anchor 5000 --min-inv 50000 > breakpoints.tsv

# 4. Render the whole-genome synteny ideogram (asm20 gives fuller coverage)
hapduo-synteny --paf hapB_to_hapA.asm20.paf --fai both.fasta.fai \
               --chain --min-block 5000 --out-prefix figures/synteny

# 5. Render one multi-evidence panel per inversion
hapduo-batch --tsv breakpoints.tsv \
             --paf hapB_to_hapA.asm5.paf --cartoon-paf hapB_to_hapA.asm20.paf \
             --ccs ccs.sorted.bam --hic contacts.hic --hic-chrom assembly \
             --fai both.fasta.fai --outdir figures/
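
Step 2 builds a `.fai` index "for chr offsets". One plausible reading, shown here as a hedged sketch rather than HapDuo's actual code, is that each sequence gets a cumulative start position so that several chromosomes can share one axis. The `.fai` layout (NAME, LENGTH, OFFSET, LINEBASES, LINEWIDTH, tab-separated) is the standard samtools format; the function name and the sequence names in the demo are made up.

```python
from typing import Dict, Iterable, Tuple


def cumulative_offsets(fai_lines: Iterable[str]) -> Tuple[Dict[str, int], int]:
    """Return ({sequence name: cumulative start}, total length) from .fai lines.

    Only the first two columns (NAME, LENGTH) are used.
    """
    offsets: Dict[str, int] = {}
    total = 0
    for line in fai_lines:
        if not line.strip():
            continue
        name, length = line.rstrip("\n").split("\t")[:2]
        offsets[name] = total  # this sequence starts where the previous ended
        total += int(length)
    return offsets, total


# Made-up index contents; a real run would pass open("both.fasta.fai").
demo_fai = [
    "hapA_chr1\t1000\t11\t60\t61",
    "hapA_chr2\t500\t1039\t60\t61",
    "hapB_chr1\t900\t1559\t60\t61",
]
offsets, total = cumulative_offsets(demo_fai)
print(offsets, total)
```

Note that concatenating the two haplotype FASTA files only yields a usable index if sequence names are unique across hapA and hapB.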

Quickstart (Snakemake)

git clone https://github.com/conchoecia/HapDuo.git
cd HapDuo
cp config.example.yaml config.yaml          # edit paths
snakemake --cores 16

The pipeline runs steps 1–5 above and writes everything under outdir/.

What does each figure show?

  • Synteny ideogram (hapduo-synteny): one row per chromosome pair, with the haplotype-A bar on top and the haplotype-B bar on the bottom (both at true Mb length), and every chained PAF block overlaid as a polygon ribbon (blue for colinear, + strand; red for inverted). This is the quickest way to see where the inversions lie along the genome.

  • Inversion ideogram (hapduo-ideogram): single-haplotype chromosome bars with each inversion call drawn as a coloured rectangle, sized by inversion length. Counts per chromosome are summarised in a side panel.

  • Per-inversion multi-evidence panel (hapduo-style4): for a single inversion of interest, a single 180 × 228 mm panel showing

    1. Hi-C contact pyramid for the haplotype-A window (cells auto-aspected so they render as squares at any figure size).
    2. CCS read-depth track and read pile on haplotype-A, with reads spanning a 1 kb window around any breakpoint highlighted dark.
    3. Haplotype-A chromosome track with the two breakpoints marked and keyed by ① and ②.
    4. The ribbon panel itself with chained PAF polygons.
    5. Haplotype-B chromosome track with markers ③ and ④.
    6. CCS read pile + depth on haplotype-B.
    7. Mirrored Hi-C pyramid for haplotype-B.
    8. Top-right inset: chromosome-scale two-haplotype synteny cartoon with the inversion outlined as a black rectangle on both bars.
    9. Bottom row: four ±20 kb CCS read-pile zooms, one per breakpoint, keyed by the same ①②③④ digits used above.
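
The read highlighting described in item 2 can be sketched as a simple interval test. This assumes the 1 kb window is centred on the breakpoint and that a read must cover the whole window to count as spanning; both are assumptions, as are the function names. Coordinates are 0-based, half-open, as returned by pysam's `reference_start` and `reference_end`.

```python
from typing import Iterable, List, Sequence, Tuple


def spans_window(read_start: int, read_end: int, breakpoint: int, window: int = 1000) -> bool:
    """True if the read covers the whole window centred on the breakpoint."""
    half = window // 2
    return read_start <= breakpoint - half and read_end >= breakpoint + half


def highlight_flags(
    reads: Iterable[Tuple[int, int]],
    breakpoints: Sequence[int],
    window: int = 1000,
) -> List[bool]:
    """One flag per read: does it span the window around any breakpoint?"""
    return [
        any(spans_window(start, end, bp, window) for bp in breakpoints)
        for start, end in reads
    ]


# Made-up read intervals and breakpoints for illustration.
reads = [(0, 12000), (9800, 10400), (10600, 20000)]
breakpoints = [10000, 50000]
print(highlight_flags(reads, breakpoints))
```

Only the first read covers 9,500 to 10,500 in full, so it alone would be drawn dark; the second overlaps the breakpoint but does not span the window.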

Citing

A citation will be added here when the accompanying manuscript is public.

Releasing

HapDuo is published to PyPI via OIDC trusted publishing — no API tokens are stored in the repository. To cut a release:

git tag v0.1.0
git push --tags

The Release to PyPI workflow at .github/workflows/release.yml builds an sdist + wheel from the tagged commit, runs a smoke test against the five console scripts, publishes to PyPI, and attaches the dist files to a matching GitHub release.

The one-time setup on the PyPI side is documented at the top of that workflow file.
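
For readers who want a comparable local check before tagging, the smoke test can be approximated by confirming that all five console scripts are on PATH after installing the built wheel. This is a hypothetical sketch, not the contents of release.yml; `missing_scripts` is a made-up helper.

```python
import shutil
from typing import List, Sequence

# The five console scripts listed under "What you get".
CONSOLE_SCRIPTS = (
    "hapduo-detect",
    "hapduo-style4",
    "hapduo-batch",
    "hapduo-ideogram",
    "hapduo-synteny",
)


def missing_scripts(names: Sequence[str] = CONSOLE_SCRIPTS) -> List[str]:
    """Return the names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_scripts()
print("all console scripts found" if not missing else f"missing: {missing}")
```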

Legacy: DPGB dotplot pipeline

This repository was previously called DPGB (Dot Plot Genome Browser) and shipped a Snakemake pipeline for chained-PAF dot plots of CLR + CCS read mappings. That pipeline is preserved under archive/dpgb-dotplot/ for anyone still running it; it is not maintained.

