Ultra-fast strand-aware mutation counter

CountMut

CountMut counts mutations from bisulfite sequencing / CAM-seq / GLORI-seq / eTAM-seq BAM files with parallel processing, quality-based mate overlap deduplication, and optimized file I/O.

Features

  • 🚀 Ultra-Fast: Direct FASTA index reading, shared file handles, BGZF multi-threading
  • 🧬 Bisulfite Support: NS, Zf, Yf tag filtering for conversion analysis
  • 🎯 Accurate: Quality-based mate overlap deduplication prevents double-counting
  • Parallel: Multi-threaded genomic window processing
  • 🔧 Flexible: Configurable filtering, strand-specific processing, auto-indexing
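To illustrate the idea behind quality-based mate overlap deduplication, here is a minimal sketch. It is not CountMut's actual implementation: the BaseCall structure, the merge_mates function, and the tie-breaking rule (mate 1 wins) are assumptions made for this example. Where both mates of a pair cover the same reference position, only the higher-quality call is kept, so the fragment contributes one count at that position.

```python
from typing import NamedTuple


class BaseCall(NamedTuple):
    """One aligned base from one mate of a read pair (hypothetical structure)."""
    pos: int    # 0-based reference position
    base: str   # observed base
    qual: int   # Phred base quality


def merge_mates(mate1: list[BaseCall], mate2: list[BaseCall]) -> list[BaseCall]:
    """Return one call per reference position for a read pair.

    Where both mates cover a position, keep only the higher-quality call
    (ties go to mate 1, an assumption), so the fragment is counted once.
    """
    best: dict[int, BaseCall] = {}
    for call in mate1 + mate2:
        kept = best.get(call.pos)
        if kept is None or call.qual > kept.qual:
            best[call.pos] = call
    return [best[pos] for pos in sorted(best)]


# Position 101 is covered by both mates; the higher-quality call is kept.
mate1 = [BaseCall(100, "A", 30), BaseCall(101, "G", 12)]
mate2 = [BaseCall(101, "A", 35), BaseCall(102, "T", 28)]
merged = merge_mates(mate1, mate2)
```

Without this step, position 101 would be counted twice, once per mate, even though both observations come from the same DNA fragment.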

Installation

pip install countmut

Quick Start

# Basic usage - auto-creates indices if needed
countmut -i input.bam -r reference.fa -o mutations.tsv

# Count C→T mutations (the conversion produced by bisulfite treatment)
countmut -i input.bam -r reference.fa -o mutations.tsv --ref-base C --mut-base T

# With custom threads and filtering
countmut -i input.bam -r reference.fa -o mutations.tsv -t 8 --max-unc 5 --min-con 2
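When CountMut is called from a pipeline script, the same command lines can be assembled in Python. The sketch below uses only flags documented in this README; the helper name build_countmut_command is hypothetical and is not part of the countmut package.

```python
import shlex


def build_countmut_command(bam, reference, output,
                           ref_base="A", mut_base="G", threads=None):
    """Assemble a countmut argument list from documented CLI options.

    Hypothetical helper for illustration; defaults mirror the README.
    """
    cmd = [
        "countmut",
        "-i", bam,
        "-r", reference,
        "-o", output,
        "--ref-base", ref_base,
        "--mut-base", mut_base,
    ]
    if threads is not None:
        cmd += ["-t", str(threads)]
    return cmd


cmd = build_countmut_command("input.bam", "reference.fa", "mutations.tsv", threads=8)
print(shlex.join(cmd))
# To execute (requires countmut on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.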

Key Options

Required:
  -i, --input PATH           Input BAM file
  -r, --reference PATH       Reference FASTA file

Output:
  -o, --output PATH          Output TSV file (default: stdout)

Mutation:
  --ref-base TEXT            Reference base [default: A]
  --mut-base TEXT            Mutation base [default: G]
  --strand [both|forward|reverse]  Strand processing [default: both]
  --region TEXT              Specific region (e.g., 'chr1:1000000-2000000')

Performance:
  -t, --threads INTEGER      Number of threads [default: auto]
  -b, --bin-size INTEGER     Genomic bin size [default: 10000]

Filtering (Bisulfite):
  --pad INTEGER              Motif window padding [default: 15]
  --trim-start INTEGER       Trim 5' bases [default: 2]
  --trim-end INTEGER         Trim 3' bases [default: 2]
  --max-unc INTEGER          Max unconverted (Zf) [default: 3]
  --min-con INTEGER          Min converted (Yf) [default: 1]
  --max-sub INTEGER          Max substitutions (NS) [default: 1]
  --min-baseq INTEGER        Min base quality (Phred) [default: 20]
  --min-mapq INTEGER         Min mapping quality (MAPQ) [default: 0]

Note: BAM must have NS, Zf, and Yf tags for bisulfite analysis.
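As a rough illustration of how the three tag thresholds might combine, the sketch below classifies a read from its NS, Zf, and Yf values using the documented defaults. The function name classify_read, the order of the checks, and the exact comparison operators are assumptions; CountMut's internal logic may differ.

```python
def classify_read(tags, max_unc=3, min_con=1, max_sub=1):
    """Classify a read from its NS/Zf/Yf tag values (illustrative only).

    Returns "drop" when the read has more than max_sub substitutions (NS),
    "unconverted" when Zf exceeds max_unc or Yf is below min_con,
    and "clean" otherwise. A missing tag raises KeyError.
    """
    if tags["NS"] > max_sub:
        return "drop"
    if tags["Zf"] > max_unc or tags["Yf"] < min_con:
        return "unconverted"
    return "clean"


# Tag values here are made up for the example.
examples = {
    "well_converted": classify_read({"NS": 0, "Zf": 1, "Yf": 12}),
    "too_many_unconverted": classify_read({"NS": 0, "Zf": 5, "Yf": 12}),
    "too_many_substitutions": classify_read({"NS": 3, "Zf": 0, "Yf": 12}),
}
```

With a BAM library such as pysam, the tag dictionary could be filled from each alignment's optional fields before calling a function like this.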

Output Format

TSV file with columns:

  • chrom, pos, strand, motif - Position, strand, and surrounding sequence context
  • u0, u1, u2 - Counts of the unconverted (reference) base, split as [drop, clean, unconverted]
  • m0, m1, m2 - Counts of the mutation base, split as [drop, clean, unconverted]
  • o0, o1, o2 - Counts of all other bases, split as [drop, clean, unconverted] (only with --save-rest)

The numeric suffix gives the filter category:

  • drop (x0): Bases that fail a quality filter (internal position, mismatch, MAPQ, or base quality)
  • clean (x1): High-quality bases that pass all filters
  • unconverted (x2): Bases in reads flagged as unconverted (Zf above --max-unc or Yf below --min-con)
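A common downstream step is to turn the clean counts into a per-site mutation rate, m1 / (u1 + m1). The sketch below assumes the TSV starts with a header line containing the column names listed above and that strand is written as + or -; both are assumptions, and the sample rows are invented for illustration rather than real CountMut output.

```python
import csv
import io

# Illustrative rows only, not real countmut output. Assumes a header line
# with the documented column names.
SAMPLE = (
    "chrom\tpos\tstrand\tmotif\tu0\tu1\tu2\tm0\tm1\tm2\n"
    "chr1\t1000\t+\tGGACT\t1\t30\t2\t0\t10\t1\n"
    "chr1\t2000\t-\tAAACA\t0\t0\t0\t0\t0\t0\n"
)


def clean_mutation_rates(handle):
    """Yield (chrom, pos, strand, rate) using only the clean (x1) counts.

    rate = m1 / (u1 + m1), or None when no clean bases cover the site.
    """
    for row in csv.DictReader(handle, delimiter="\t"):
        u1, m1 = int(row["u1"]), int(row["m1"])
        depth = u1 + m1
        rate = m1 / depth if depth else None
        yield row["chrom"], int(row["pos"]), row["strand"], rate


rates = list(clean_mutation_rates(io.StringIO(SAMPLE)))
```

To read a real output file, replace io.StringIO(SAMPLE) with open("mutations.tsv", newline="").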

 

Copyright © 2025-present Chang Y
