Population-aware, MST-based haplotype graph analysis and visualization in Python.

HapNet

HapNet is a lightweight Python command-line package for constructing population-aware, minimum-spanning-tree-based haplotype graphs from aligned FASTA files.

HapNet is designed for reproducible single-locus workflows. It collapses identical aligned sequences into haplotypes, calculates pairwise Hamming distances among haplotypes, constructs a minimum spanning tree (MST), plots a population-aware haplotype graph, and writes machine-readable TSV summaries.
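The first two steps described above, collapsing identical sequences and computing pairwise Hamming distances, can be sketched in plain Python. This is an illustrative sketch, not HapNet's own code: the function names `collapse_haplotypes` and `hamming` are hypothetical, and the third record (`Ind03_NK`) is invented for the example.

```python
from itertools import combinations


def collapse_haplotypes(records):
    """Group identical aligned sequences; returns {sequence: [sequence IDs]}."""
    haplotypes = {}
    for seq_id, seq in records:
        haplotypes.setdefault(seq, []).append(seq_id)
    return haplotypes


def hamming(a, b):
    """Count positions at which two equal-length aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


# Two individuals share one sequence, so three records collapse to two haplotypes.
records = [
    ("Ind01_NK", "ACTGACTG"),
    ("Ind02_RI", "ACTGATTA"),
    ("Ind03_NK", "ACTGACTG"),  # hypothetical extra record
]
haplotypes = collapse_haplotypes(records)

# Pairwise distances among the distinct haplotype sequences.
distances = {(a, b): hamming(a, b) for a, b in combinations(haplotypes, 2)}
```

Here the two haplotypes differ at two aligned positions, so their Hamming distance is 2.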

Important scope note: HapNet constructs an MST-based haplotype graph. It does not infer median-joining, statistical-parsimony, or reticulate haplotype networks.
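A spanning tree over n haplotypes has exactly n - 1 edges and no cycles, which is why an MST cannot represent reticulation. The sketch below illustrates this with Prim's algorithm. It is an assumption made for illustration: HapNet's actual MST algorithm and tie-breaking rule are not stated here, and the function name and example haplotypes are hypothetical.

```python
def minimum_spanning_tree(nodes, dist):
    """Prim's algorithm; returns a list of (u, v, weight) edges."""
    nodes = list(nodes)
    if not nodes:
        return []
    in_tree = [nodes[0]]  # a list keeps tie-breaking deterministic
    edges = []
    while len(in_tree) < len(nodes):
        # Pick the cheapest edge joining the tree to a node outside it.
        u, v = min(
            ((u, v) for u in in_tree for v in nodes if v not in in_tree),
            key=lambda pair: dist(*pair),
        )
        edges.append((u, v, dist(u, v)))
        in_tree.append(v)
    return edges


def hamming(a, b):
    """Count differing positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical haplotypes, chosen only to keep the distances easy to check.
haps = ["AAAA", "AAAT", "AATT", "TTTT"]
tree = minimum_spanning_tree(haps, hamming)
total_weight = sum(weight for _, _, weight in tree)
```

When several edges tie for the minimum, different MSTs of equal total weight exist, so the drawn graph can depend on the tie-breaking rule.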

Installation

pip install hapnet

Standard input format

By default, population identity is parsed from the last underscore-delimited FASTA header token:

>Ind01_NK
ACTGACTG
>Ind02_RI
ACTGATTA
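The default header rule can be sketched as follows. The function name `population_from_header` is hypothetical, and the sketch assumes the header carries no description after whitespace; how HapNet treats such headers is not stated here.

```python
def population_from_header(header):
    """Return the last underscore-delimited token of a FASTA header."""
    name = header.lstrip(">").strip()
    _sample, sep, population = name.rpartition("_")
    if not sep or not population:
        raise ValueError(f"no population token in header: {header!r}")
    return population


populations = [population_from_header(h) for h in (">Ind01_NK", ">Ind02_RI")]
```

Because only the last token is used, sample names may themselves contain underscores under this rule.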

Run:

hapnet examples/basic/basic.fasta --out basic.svg --log-prefix basic

Phased diploid input

HapNet v0.2.0 adds optional support for already phased diploid sequences. HapNet does not infer phase; it preserves individual identity across phased allele copies that were generated by external software.

Header format in phased mode:

>Ind01_a_NK
ACTGACTG
>Ind01_b_NK
ACTGATTA
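Judging from the example headers, the phased layout appears to be individual, allele, population, joined by underscores. The sketch below parses that layout and groups allele copies by individual. The layout is inferred from the example rather than from a formal specification, and the names `PhasedHeader` and `parse_phased_header` are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class PhasedHeader(NamedTuple):
    individual: str
    allele: str
    population: str


def parse_phased_header(header):
    """Split '>Individual_allele_Population' on its last two underscores."""
    name = header.lstrip(">").strip()
    parts = name.rsplit("_", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected individual_allele_population: {header!r}")
    return PhasedHeader(*parts)


# Group the two phased allele copies under their shared individual ID.
records = {">Ind01_a_NK": "ACTGACTG", ">Ind01_b_NK": "ACTGATTA"}
genotypes = defaultdict(dict)
for header, seq in records.items():
    parsed = parse_phased_header(header)
    genotypes[parsed.individual][parsed.allele] = seq
```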

Run:

hapnet examples/phased/phased_example.fasta --phased --out phased.svg --log-prefix phased

With --phased and --log-prefix phased, HapNet also writes an individual-level genotype table:

phased_individual_genotypes.tsv

Metadata file option

Instead of encoding population and allele information in headers, users can provide a tab-delimited metadata file:

sequence_id	individual_id	allele	population
Ind01_a	Ind01	a	NK
Ind01_b	Ind01	b	NK
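A metadata table like the one above can be checked before a run with the standard `csv` module. This is a sketch under assumptions: the function name `read_metadata` is hypothetical, and treating all four columns as required is an assumption, not a documented HapNet rule.

```python
import csv
import io

REQUIRED_COLUMNS = ("sequence_id", "individual_id", "allele", "population")


def read_metadata(handle):
    """Read a tab-delimited metadata table into {sequence_id: row dict}."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"metadata is missing columns: {missing}")
    return {row["sequence_id"]: row for row in reader}


example = (
    "sequence_id\tindividual_id\tallele\tpopulation\n"
    "Ind01_a\tInd01\ta\tNK\n"
    "Ind01_b\tInd01\tb\tNK\n"
)
metadata = read_metadata(io.StringIO(example))
```

For a file on disk, pass `open("metadata.tsv", newline="")` instead of the in-memory example.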

Run:

hapnet input.fasta --metadata metadata.tsv --phased --out network.svg --log-prefix run1

Main output files

For --log-prefix run1, HapNet writes:

  • run1_haplotypes.tsv: haplotype IDs, frequencies, populations, and sequences
  • run1_membership.tsv: sequence-to-haplotype membership
  • run1_shared_haplotypes.tsv: haplotypes shared among populations
  • run1_haplotype_individuals.tsv: individuals represented in each haplotype
  • run1_summary.tsv: summary statistics
  • run1_run_metadata.tsv: run metadata for reproducibility
  • run1_individual_genotypes.tsv: phased individual genotype table, written only with --phased
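In a pipeline it can be useful to confirm that a run produced every file listed above. The sketch below builds the expected paths from a log prefix, following the `<prefix>_<name>.tsv` pattern shown in the list. The helper names `expected_outputs` and `missing_outputs` are hypothetical and not part of HapNet.

```python
from pathlib import Path

# Suffixes taken from the output list above.
TABLE_NAMES = [
    "haplotypes",
    "membership",
    "shared_haplotypes",
    "haplotype_individuals",
    "summary",
    "run_metadata",
]


def expected_outputs(log_prefix, phased=False):
    """Return the TSV paths a run with this log prefix should write."""
    names = list(TABLE_NAMES)
    if phased:
        names.append("individual_genotypes")  # written only with --phased
    return [Path(f"{log_prefix}_{name}.tsv") for name in names]


def missing_outputs(log_prefix, phased=False):
    """Return the expected output paths that do not exist on disk."""
    return [p for p in expected_outputs(log_prefix, phased) if not p.exists()]
```

For example, `missing_outputs("run1", phased=True)` returns an empty list once all seven tables are present.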

Ambiguous and missing data

By default, ambiguous characters and gaps are treated as literal character states during Hamming-distance calculation. To ignore positions containing N, ?, -, or . in either sequence during pairwise comparisons, use:

hapnet input.fasta --ignore-ambiguous --out network.svg --log-prefix run1
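The difference between the two modes can be sketched as follows. This is not HapNet's implementation: the function name and `ignore_ambiguous` parameter are hypothetical, and matching the four characters case-sensitively is an assumption.

```python
IGNORED_STATES = frozenset("N?-.")


def hamming(a, b, ignore_ambiguous=False):
    """Hamming distance; optionally skip sites with N, ?, - or . in either sequence."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        x != y
        for x, y in zip(a, b)
        if not (ignore_ambiguous and (x in IGNORED_STATES or y in IGNORED_STATES))
    )


literal = hamming("ACNTGA", "ACGTTA")                          # N counts as a state
ignored = hamming("ACNTGA", "ACGTTA", ignore_ambiguous=True)   # N site is skipped
```

In this example the literal distance is 2 and the distance with ambiguous sites ignored is 1, so the option can change which haplotypes are adjacent in the MST.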

Scripted workflow example

HapNet can be integrated into a shell workflow that processes many aligned FASTA files:

# Create the output directories up front in case they do not exist yet.
mkdir -p networks results

for fasta in alignments/*.fasta
do
    base=$(basename "$fasta" .fasta)
    hapnet "$fasta" \
      --out "networks/${base}.svg" \
      --log-prefix "results/${base}"
done
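The same batch run can be driven from Python with `subprocess`, which stops at the first failing alignment. The sketch uses only the `hapnet` options shown in this document. The function names `build_hapnet_command` and `run_all` are hypothetical, and the directory names mirror the shell loop.

```python
import subprocess
from pathlib import Path


def build_hapnet_command(fasta, network_dir, results_dir):
    """Build the hapnet argument list for one aligned FASTA file."""
    base = fasta.stem  # file name without the .fasta extension
    return [
        "hapnet",
        str(fasta),
        "--out", str(network_dir / f"{base}.svg"),
        "--log-prefix", str(results_dir / base),
    ]


def run_all(alignment_dir=Path("alignments"),
            network_dir=Path("networks"),
            results_dir=Path("results")):
    """Run hapnet on every .fasta file in alignment_dir, in sorted order."""
    network_dir.mkdir(parents=True, exist_ok=True)
    results_dir.mkdir(parents=True, exist_ok=True)
    for fasta in sorted(alignment_dir.glob("*.fasta")):
        # check=True raises CalledProcessError if hapnet exits non-zero.
        subprocess.run(build_hapnet_command(fasta, network_dir, results_dir),
                       check=True)
```

Calling `run_all()` reproduces the shell loop. Passing the arguments as a list avoids shell quoting problems with unusual file names.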

Citation

If you use HapNet, please cite the associated manuscript when available.
