A package for haplotype analysis from BAM files

Hasan: Haplotype Analysis from BAM files

Hasan is a Python package for analyzing haplotypes from BAM files using SNP information. It constructs directed acyclic graphs (DAGs) to identify and visualize potential haplotypes based on sequencing data.

Features

  • Read and process SNP information from TSV files
  • Convert VCF files to compatible TSV format
  • Build phasing tables from BAM files
  • Create directed acyclic graphs (DAGs) for haplotype visualization
  • Find and analyze potential haplotypes
  • Interactive graph visualization with draggable nodes
  • Command-line interface with rich output formatting

Installation

pip install hasan

Requirements

  • Python ≥ 3.6
  • pysam
  • pandas
  • networkx
  • matplotlib
  • click
  • rich

Usage

Command Line Interface

The package provides two main commands:

  1. Analyze haplotypes:
hasan analyze <bam_file> <snps_file> [options]

Options:

  • --plot/--no-plot: Enable/disable interactive plot visualization
  • --output/-o: Specify output TSV file for haplotype results
  • --verbose/-v: Print detailed progress information

Example:

hasan analyze sample.bam variants.tsv --plot --output results.tsv --verbose

  2. Convert VCF to TSV:
hasan convert <input_vcf> <output_tsv> [options]

Options:

  • --verbose/-v: Print detailed progress information

Example:

hasan convert variants.vcf variants.tsv --verbose

Input File Formats

SNPs File (TSV format)

CHROM   POS     REF     ALT     QUAL    DP
chr1    1000    A       G       40      20
chr1    1500    C       T       35      15
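
As a rough illustration of the layout above, the sketch below parses such a table with the standard library only. It assumes the columns are tab-separated and that POS and DP are integers; `load_snps` is a hypothetical helper for this example, not the package's `read_snps`.

```python
import csv
import io


def load_snps(handle):
    """Parse a tab-separated SNP table into a list of dicts with typed fields."""
    reader = csv.DictReader(handle, delimiter="\t")
    snps = []
    for row in reader:
        snps.append({
            "CHROM": row["CHROM"],
            "POS": int(row["POS"]),
            "REF": row["REF"],
            "ALT": row["ALT"],
            "QUAL": float(row["QUAL"]),
            "DP": int(row["DP"]),
        })
    return snps


# The two example rows from the table above, written with explicit tabs.
example = (
    "CHROM\tPOS\tREF\tALT\tQUAL\tDP\n"
    "chr1\t1000\tA\tG\t40\t20\n"
    "chr1\t1500\tC\tT\t35\t15\n"
)
snps = load_snps(io.StringIO(example))
```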

Note: When converting from VCF, the following filters are applied:

  • Exclude indels (only SNPs are kept)
  • Require minimum quality score (QUAL ≥ 30)
  • Require minimum depth (DP ≥ 10)
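
The three filters can be sketched on a single VCF data line as follows. This is an illustrative approximation, not the package's converter: `passes_filters` is a hypothetical name, and the sketch assumes depth is stored under the `DP` key of the INFO column and that a SNP is a single-base REF paired with a single-base ALT.

```python
def passes_filters(vcf_line, min_qual=30.0, min_dp=10):
    """Return (CHROM, POS, REF, ALT, QUAL, DP) if the record passes, else None."""
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, qual, _filter, info = fields[:8]

    # Keep only single-base substitutions; indels and multi-allelic
    # records (ALT containing a comma) fail the length check.
    if len(ref) != 1 or len(alt) != 1 or ref not in "ACGT" or alt not in "ACGT":
        return None

    # A missing quality value (".") is treated as failing the threshold.
    if qual == "." or float(qual) < min_qual:
        return None

    # Assumption: depth is taken from the DP key in the INFO column.
    info_map = dict(item.split("=", 1) for item in info.split(";") if "=" in item)
    depth = int(info_map.get("DP", 0))
    if depth < min_dp:
        return None

    return (chrom, int(pos), ref, alt, float(qual), depth)


kept = passes_filters("chr1\t1000\t.\tA\tG\t40\tPASS\tDP=20")
indel = passes_filters("chr1\t1200\t.\tAT\tA\t50\tPASS\tDP=30")
low_qual = passes_filters("chr1\t1300\t.\tC\tT\t20\tPASS\tDP=30")
low_depth = passes_filters("chr1\t1400\t.\tG\tA\t45\tPASS\tDP=5;AF=0.5")
```

Only the first record is returned as a tuple; the other three yield None because they fail the indel, quality, and depth checks respectively.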

Python API

from hasan import read_snps, build_phasing_table, create_dag, find_haplotypes

# Read SNP information
snps_df = read_snps("variants.tsv")

# Build phasing table
phasing_data = build_phasing_table("sample.bam", snps_df)

# Create graph
G = create_dag(phasing_data, snps_df)

# Find haplotypes
haplotypes = find_haplotypes(G)

Output

The package produces four kinds of output:

  1. Interactive visualization (when using --plot)
  2. Static graph image (haplotype_graph.png)
  3. TSV file with haplotype frequencies (when using --output)
  4. Rich console output showing haplotype proportions

How It Works

  1. SNP Reading: Loads SNP positions and variants from a TSV file.
  2. Phasing Table: Processes the BAM file to count base occurrences at SNP positions.
  3. Graph Construction: Creates a DAG where:
    • Nodes represent bases at each position
    • Edges represent connections between consecutive positions
    • Edge weights represent proportion of reads supporting the connection
  4. Haplotype Finding: Identifies possible haplotypes by finding paths through the graph.
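
The steps above can be sketched in a few lines. The example below is a simplified, dependency-free approximation rather than the package's implementation: it uses plain dictionaries instead of networkx, represents each read as a hypothetical dict mapping SNP position to observed base, normalises edge weights per pair of consecutive positions, and scores a path by the product of its edge weights. All of these choices are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_edges(reads, positions):
    """Link bases at consecutive SNP positions, weighted by read proportion.

    Returns {(pos_a, base_a): {(pos_b, base_b): proportion}}.
    """
    edges = defaultdict(dict)
    for pos_a, pos_b in zip(positions, positions[1:]):
        # Count base pairs among reads that cover both positions.
        pair_counts = Counter(
            (read[pos_a], read[pos_b])
            for read in reads
            if pos_a in read and pos_b in read
        )
        total = sum(pair_counts.values())
        for (base_a, base_b), n in pair_counts.items():
            edges[(pos_a, base_a)][(pos_b, base_b)] = n / total
    return dict(edges)


def find_paths(edges, positions):
    """Enumerate paths from the first to the last position, best score first."""
    first, last = positions[0], positions[-1]
    results = []

    def walk(node, bases, score):
        if node[0] == last:
            results.append(("".join(bases), score))
            return
        for nxt, weight in sorted(edges.get(node, {}).items()):
            walk(nxt, bases + [nxt[1]], score * weight)

    for start in sorted(node for node in edges if node[0] == first):
        walk(start, [start[1]], 1.0)
    return sorted(results, key=lambda item: item[1], reverse=True)


positions = [1000, 1500, 2000]
reads = [{1000: "A", 1500: "C", 2000: "G"}] * 3 + [{1000: "G", 1500: "T", 2000: "T"}]
edges = build_edges(reads, positions)
haplotypes = find_paths(edges, positions)
```

Because edges only ever point from one SNP position to the next, the graph cannot contain a cycle, which is what makes path enumeration terminate.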

Visualization

The interactive visualization allows you to:

  • Drag nodes to rearrange the graph
  • View edge weights representing read proportions
  • Distinguish between reference (green) and alternate (blue) bases
