Ultra-fast strand-aware mutation counter

CountMut

Ultra-fast strand-aware mutation counter for bisulfite sequencing analysis

CountMut is a high-performance tool for counting mutations from bisulfite sequencing BAM files (BS-seq, CAM-seq, GLORI-seq, eTAM-seq). It features parallel processing, quality-based mate overlap deduplication, and optimized file I/O for maximum speed.

Features

  • 🚀 Ultra-Fast: Calls mutations without building a read pileup
  • 🧬 Bisulfite Support: NS, Zf, Yf tag filtering for conversion analysis
  • 🎯 Accurate: Quality-based mate overlap deduplication prevents double-counting
  • Parallel: Multi-threaded genomic window processing
  • 🔧 Flexible: Configurable filtering, strand-specific processing, auto-indexing
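
The deduplication idea in the list above can be illustrated with a short sketch. It assumes that when both mates of a pair cover the same reference position, only the call with the higher base quality is counted; the tie-breaking rule and the names `BaseCall` and `resolve_overlap` are hypothetical and not taken from CountMut itself.

```python
from typing import NamedTuple, Optional


class BaseCall(NamedTuple):
    base: str     # observed base at the overlapping position
    quality: int  # Phred base quality of that call


def resolve_overlap(
    mate1: Optional[BaseCall], mate2: Optional[BaseCall]
) -> Optional[BaseCall]:
    """Return the single call to count for one fragment at one position."""
    if mate1 is None:
        return mate2
    if mate2 is None:
        return mate1
    # Assumption: the higher-quality call wins, and ties go to mate 1.
    return mate1 if mate1.quality >= mate2.quality else mate2
```

Counting one call per fragment, rather than one per mate, is what keeps an overlapping pair from contributing twice at the same position.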

Installation

pip install countmut

Quick Start

# Basic usage - auto-creates indices if needed
countmut -i input.bam -r reference.fa -o mutations.tsv

# Count T→C mutations (common in bisulfite sequencing)
countmut -i input.bam -r reference.fa -o mutations.tsv --ref-base T --mut-base C

# With custom threads and filtering
countmut -i input.bam -r reference.fa -o mutations.tsv -t 8 --max-unc 5 --min-con 2
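
When the commands above are launched from a Python pipeline, the argument list can be assembled programmatically. The following is a minimal sketch that uses only flags documented in this README; the helper name `build_countmut_cmd` is hypothetical, and the command is printed rather than executed.

```python
import shlex
from typing import List, Optional


def build_countmut_cmd(
    bam: str,
    reference: str,
    output: str,
    ref_base: Optional[str] = None,
    mut_base: Optional[str] = None,
    threads: Optional[int] = None,
    max_unc: Optional[int] = None,
    min_con: Optional[int] = None,
) -> List[str]:
    """Assemble a countmut argument list; unset options keep the tool's defaults."""
    cmd = ["countmut", "-i", bam, "-r", reference, "-o", output]
    if ref_base is not None:
        cmd += ["--ref-base", ref_base]
    if mut_base is not None:
        cmd += ["--mut-base", mut_base]
    if threads is not None:
        cmd += ["-t", str(threads)]
    if max_unc is not None:
        cmd += ["--max-unc", str(max_unc)]
    if min_con is not None:
        cmd += ["--min-con", str(min_con)]
    return cmd


cmd = build_countmut_cmd(
    "input.bam", "reference.fa", "mutations.tsv", threads=8, max_unc=5, min_con=2
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing a list instead of a single shell string avoids quoting problems with file paths that contain spaces.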

Options

Input/Output

-i, --input PATH       Input BAM file (coordinate-sorted) [required]
-r, --reference PATH   Reference FASTA file [required]
-o, --output PATH      Output TSV file (default: stdout)
-f, --force            Overwrite output without prompting

Mutation Analysis

--ref-base TEXT        Reference base to count from [default: A]
--mut-base TEXT        Mutation base to count [default: G]
--strand TEXT          Strand: both/forward/reverse [default: both]
--region TEXT          Genomic region (e.g., 'chr1:1000000-2000000')
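
The region string follows the familiar `chrom:start-end` shape. As a rough sketch, it can be validated before launching a run as follows; the helper `parse_region` is hypothetical, and the coordinate convention (1-based, inclusive, as in samtools) is an assumption rather than something this README states.

```python
import re
from typing import Tuple

# chrom, then ':', then start-end as plain integers (no thousands separators).
_REGION_RE = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(region: str) -> Tuple[str, int, int]:
    """Split 'chr1:1000000-2000000' into (chrom, start, end)."""
    match = _REGION_RE.match(region)
    if match is None:
        raise ValueError(f"not a chrom:start-end region: {region!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"start is greater than end in {region!r}")
    return match["chrom"], start, end
```

Contig names that themselves contain a colon are rejected by this simple pattern.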

Performance

-t, --threads INTEGER  Number of parallel threads [default: auto]
-b, --bin-size INTEGER Genomic bin size in bp [default: 10000]
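
To make the bin-size option concrete, here is a minimal sketch of how a contig could be cut into fixed-size windows that are then handed to worker threads. The function `iter_bins` is hypothetical, and the 0-based half-open coordinates are an assumption; CountMut's internal windowing may differ.

```python
from typing import Iterator, Tuple


def iter_bins(
    chrom: str, length: int, bin_size: int = 10000
) -> Iterator[Tuple[str, int, int]]:
    """Yield (chrom, start, end) windows, 0-based and half-open."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    for start in range(0, length, bin_size):
        # The last window is clipped to the contig length.
        yield chrom, start, min(start + bin_size, length)
```

Smaller bins give finer-grained work units across threads at the cost of more per-window overhead.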

Quality Filters

--min-baseq INTEGER    Min base quality (Phred score) [default: 20]
--min-mapq INTEGER     Min mapping quality (MAPQ) [default: 0]
--max-sub INTEGER      Max substitutions (NS tag) [default: 1]
--trim-start INTEGER   Trim 5' bases from reads [default: 2]
--trim-end INTEGER     Trim 3' bases from reads [default: 2]
--max-unc INTEGER      Max unconverted (Zf tag) [default: 3]
--min-con INTEGER      Min converted (Yf tag) [default: 1]

Output Records

-p, --pad INTEGER      Motif window half-size [default: 15]
-s, --save-rest        Include other bases (o0, o1, o2 columns)
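
The motif window controlled by --pad spans 2×pad+1 bases centered on the reported site. The sketch below shows one way to extract such a window from a reference sequence. Padding with N at contig ends and reverse-complementing on the minus strand are assumptions, and `motif_window` is a hypothetical helper, not part of CountMut.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def motif_window(sequence: str, pos: int, pad: int = 15, strand: str = "+") -> str:
    """Return the (2*pad + 1)-base window centered on 1-based position pos."""
    if not 1 <= pos <= len(sequence):
        raise ValueError("pos is outside the sequence")
    center = pos - 1  # convert to a 0-based index
    left = max(0, center - pad)
    right = min(len(sequence), center + pad + 1)
    # Assumption: windows running off a contig end are padded with N.
    left_fill = "N" * (pad - (center - left))
    right_fill = "N" * (pad - (right - center - 1))
    window = left_fill + sequence[left:right] + right_fill
    if strand == "-":
        # Assumption: minus-strand motifs are reported as the reverse complement.
        window = window.translate(_COMPLEMENT)[::-1]
    return window
```

With the default pad of 15 the window is 31 bases long.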

Note: BAM files must have NS, Zf, and Yf tags (essential for bisulfite analysis). Indices (.bai, .fai) are created automatically if missing.

Output Format

TSV file with the following columns:

Column        Description
chrom         Chromosome name
pos           Genomic position (1-based)
strand        Strand (+ or -)
motif         Sequence context (2×pad+1 bp window)
u0, u1, u2    Unconverted (reference base) counts
m0, m1, m2    Mutation (mutation base only) counts
o0, o1, o2    Other bases counts (with --save-rest)
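
The columns above can be loaded with the standard library. This is a minimal sketch that assumes the TSV starts with a header row using exactly these column names, which this README does not confirm; the helper `read_counts` and the sample row are made up for illustration.

```python
import csv
import io
from typing import Dict, Iterator, TextIO, Union

COUNT_PREFIXES = ("u", "m", "o")


def read_counts(handle: TextIO) -> Iterator[Dict[str, Union[str, int]]]:
    """Yield one dict per site, with pos and all count columns cast to int."""
    # Assumption: the first line is a header naming the columns.
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        record: Dict[str, Union[str, int]] = dict(row)
        record["pos"] = int(row["pos"])
        for name, value in row.items():
            # Count columns are u0..u2, m0..m2 and, with --save-rest, o0..o2.
            if len(name) == 2 and name[0] in COUNT_PREFIXES and name[1] in "012":
                record[name] = int(value)
        yield record


# Made-up sample data, for illustration only.
example = io.StringIO(
    "chrom\tpos\tstrand\tmotif\tu0\tu1\tu2\tm0\tm1\tm2\n"
    "chr1\t1000\t+\tACA\t1\t90\t2\t0\t10\t1\n"
)
records = list(read_counts(example))
```

For a real run, pass an open file handle for the output TSV instead of the in-memory example.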

Count categories (x0, x1, x2):

  • x0 (low quality): Bases failing quality filters (trim region, max-sub, min-mapq, min-baseq)
  • x1 (high conversion): Bases from reads with high conversion efficiency (low Zf and high Yf)
  • x2 (insufficient conversion): Bases from reads with insufficient conversion efficiency (high Zf or low Yf)
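
The three categories can be read as a small decision rule. The sketch below assumes that "low Zf" means Zf ≤ --max-unc and "high Yf" means Yf ≥ --min-con, with inclusive comparisons; the exact boundary handling in CountMut may differ, and `count_category` is a hypothetical name.

```python
def count_category(
    passes_quality: bool, zf: int, yf: int, max_unc: int = 3, min_con: int = 1
) -> int:
    """Return 0, 1, or 2 for the count column suffix of a single base."""
    if not passes_quality:
        # Failed a quality filter: trim region, max-sub, min-mapq, or min-baseq.
        return 0
    # Assumption: both thresholds are inclusive.
    if zf <= max_unc and yf >= min_con:
        return 1  # read shows high conversion efficiency
    return 2  # read shows insufficient conversion efficiency
```

For example, a mutated base on a read that passes the quality filters with Zf=2 and Yf=3 would fall into m1 under the default thresholds.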

Copyright © 2025-present Chang Y
