Ultra-fast strand-aware mutation counter

CountMut

Ultra-fast strand-aware mutation counter for bisulfite sequencing analysis

CountMut is a high-performance tool for counting mutations from bisulfite sequencing BAM files (BS-seq, CAM-seq, GLORI-seq, eTAM-seq). It features parallel processing, quality-based mate overlap deduplication, and optimized file I/O for maximum speed.

Features

  • 🚀 Ultra-Fast: Counts mutations without piling up reads
  • 🧬 Bisulfite Support: NS, Zf, Yf tag filtering for conversion analysis
  • 🎯 Accurate: Quality-based mate overlap deduplication prevents double-counting
  • Parallel: Multi-threaded genomic window processing
  • 🔧 Flexible: Configurable filtering, strand-specific processing, auto-indexing
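
To illustrate the idea behind quality-based mate overlap deduplication, here is a minimal sketch. It assumes that when both mates of a pair cover the same position, only the observation with the higher base quality is counted. The `Obs` type, the `resolve_overlap` function, and the tie-breaking rule are hypothetical and are not taken from CountMut's source.

```python
from typing import NamedTuple


class Obs(NamedTuple):
    """One base call from one mate at a given reference position (hypothetical type)."""
    base: str
    qual: int  # Phred base quality


def resolve_overlap(mate1: Obs, mate2: Obs) -> Obs:
    """Return the single observation to count where both mates overlap.

    Assumed rule: keep the higher-quality call; on a tie, keep mate1.
    """
    return mate2 if mate2.qual > mate1.qual else mate1


print(resolve_overlap(Obs("A", 30), Obs("G", 37)))  # Obs(base='G', qual=37)
```

Counting one observation per fragment per position is what keeps an overlapping pair from contributing twice to the same site.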

Installation

pip install countmut

Quick Start

# Basic usage - auto-creates indices if needed
countmut -i input.bam -r reference.fa -o mutations.tsv

# Count T→C mutations (common in bisulfite sequencing)
countmut -i input.bam -r reference.fa -o mutations.tsv --ref-base T --mut-base C

# With custom threads and filtering
countmut -i input.bam -r reference.fa -o mutations.tsv -t 8 --max-unc 5 --min-con 2
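
When CountMut is driven from a Python pipeline, the same commands can be assembled as an argument list. The sketch below uses only flags listed in this README; the `build_command` helper itself is hypothetical and is not part of the package.

```python
import shlex


def build_command(bam, reference, output, ref_base=None, mut_base=None, threads=None):
    """Assemble a countmut argument list (hypothetical helper, not a CountMut API)."""
    cmd = ["countmut", "-i", bam, "-r", reference, "-o", output]
    if ref_base is not None:
        cmd += ["--ref-base", ref_base]
    if mut_base is not None:
        cmd += ["--mut-base", mut_base]
    if threads is not None:
        cmd += ["-t", str(threads)]
    return cmd


cmd = build_command("input.bam", "reference.fa", "mutations.tsv",
                    ref_base="T", mut_base="C", threads=8)
print(shlex.join(cmd))
# countmut -i input.bam -r reference.fa -o mutations.tsv --ref-base T --mut-base C -t 8
```

With countmut installed, the list can be passed to `subprocess.run(cmd, check=True)`.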

Options

Input/Output

-i, --input PATH       Input BAM file (coordinate-sorted) [required]
-r, --reference PATH   Reference FASTA file [required]
-o, --output PATH      Output TSV file (default: stdout)
-f, --force            Overwrite output without prompting

Mutation Analysis

--ref-base TEXT        Reference base to count from [default: A]
--mut-base TEXT        Mutation base to count [default: G]
--strand TEXT          Strand: both/forward/reverse [default: both]
--region TEXT          Genomic region (e.g., 'chr1:1000000-2000000')
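
The region string follows the familiar `chrom:start-end` form. As a rough illustration of how such a string can be validated before it is passed to `--region`, the sketch below assumes samtools-style 1-based inclusive coordinates. The `parse_region` function is hypothetical, and CountMut's own parser may accept other forms.

```python
def parse_region(region: str):
    """Split 'chrom:start-end' into (chrom, start, end); assumes 1-based inclusive coordinates."""
    # rpartition keeps any colon inside the contig name on the chrom side.
    chrom, sep, span = region.rpartition(":")
    if not sep or not chrom:
        raise ValueError(f"expected 'chrom:start-end', got {region!r}")
    start_s, _, end_s = span.partition("-")
    # int() raises ValueError on an empty or non-numeric field.
    start = int(start_s.replace(",", ""))
    end = int(end_s.replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in {region!r}")
    return chrom, start, end


print(parse_region("chr1:1000000-2000000"))  # ('chr1', 1000000, 2000000)
```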

Performance

-t, --threads INTEGER  Number of parallel threads [default: auto]
-b, --bin-size INTEGER Genomic bin size in bp [default: 10000]
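
To give a feel for how bin size and thread count interact, here is a minimal sketch of splitting a coordinate range into fixed-size bins and handing them to a worker pool. It is an illustration only: `make_bins` and `process_bin` are hypothetical names, the half-open coordinates are an assumption, and whether CountMut uses threads or processes internally is not stated here.

```python
from concurrent.futures import ThreadPoolExecutor


def make_bins(start, end, bin_size=10000):
    """Yield half-open (bin_start, bin_end) windows covering [start, end)."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    for s in range(start, end, bin_size):
        yield s, min(s + bin_size, end)


def process_bin(bin_range):
    """Placeholder for per-bin work; here it only reports the bin length."""
    s, e = bin_range
    return e - s


with ThreadPoolExecutor(max_workers=4) as pool:
    sizes = list(pool.map(process_bin, make_bins(0, 25000)))
print(sizes)  # [10000, 10000, 5000]
```

Smaller bins give the pool more units to balance across workers, while larger bins reduce per-bin overhead.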

Alternative Mutation Tagging

--ref-base2 TEXT       Alternative reference base for tagging (e.g., 'C')
--mut-base2 TEXT       Alternative mutation base for tagging (e.g., 'T')
--output-bam PATH      Output BAM with alternative tags (Yc, Zc)

Quality Filters

--min-baseq INTEGER    Min base quality (Phred score) [default: 20]
--min-mapq INTEGER     Min mapping quality (MAPQ) [default: 0]
--max-sub INTEGER      Max substitutions (NS tag) [default: 1]
--trim-start INTEGER   Trim N bases from read 5' end (fragment orientation) [default: 2]
--trim-end INTEGER     Trim N bases from read 3' end (fragment orientation) [default: 2]
--max-unc INTEGER      Max unconverted (Zf tag) [default: 3]
--min-con INTEGER      Min converted (Yf tag) [default: 1]

Output Records

-p, --pad INTEGER      Motif window half-size [default: 15]
-s, --save-rest        Include other bases (o0, o1, o2 columns)

Note: BAM files must have NS, Zf, and Yf tags (essential for bisulfite analysis). Indices (.bai, .fai) are created automatically if missing.

Output Format

TSV file with the following columns:

Column        Description
chrom         Chromosome name
pos           Genomic position (1-based)
strand        Strand (+ or -)
motif         Sequence context (2×pad+1 bp window)
u0, u1, u2    Unconverted (reference base) counts
m0, m1, m2    Mutation (mutation base only) counts
o0, o1, o2    Other bases counts (with --save-rest)

Count categories (x0, x1, x2, where x is u, m, or o):

  • x0 (low quality): Bases failing quality filters (trim region, max-sub, min-mapq, min-baseq)
  • x1 (insufficient conversion): Bases from reads with insufficient conversion efficiency (high Zf or low Yf)
  • x2 (high conversion): Bases from reads with high conversion efficiency (low Zf and high Yf)
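
The three categories can be read as a two-stage decision: quality filters first, then read-level conversion. The sketch below encodes that reading using the default thresholds from the Quality Filters options. The `count_category` function is hypothetical, and the boundary handling (a value equal to a threshold passes) is an assumption that should be checked against CountMut's actual behavior.

```python
def count_category(baseq, mapq, ns, zf, yf, in_trim, *,
                   min_baseq=20, min_mapq=0, max_sub=1, max_unc=3, min_con=1):
    """Return 0, 1, or 2 for one observed base (hypothetical helper).

    ns, zf, yf are the read's NS, Zf, and Yf tag values;
    in_trim is True when the base lies in a trimmed read end.
    """
    # x0: the base or its read fails a quality filter.
    if in_trim or baseq < min_baseq or mapq < min_mapq or ns > max_sub:
        return 0
    # x1: the read shows insufficient conversion (high Zf or low Yf).
    if zf > max_unc or yf < min_con:
        return 1
    # x2: the read passes all filters and shows high conversion.
    return 2


print(count_category(baseq=30, mapq=60, ns=0, zf=0, yf=5, in_trim=False))  # 2
print(count_category(baseq=30, mapq=60, ns=0, zf=4, yf=5, in_trim=False))  # 1
print(count_category(baseq=10, mapq=60, ns=0, zf=0, yf=5, in_trim=False))  # 0
```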

Example Output

Without --save-rest:

chrom pos strand motif u0 u1 u2 m0 m1 m2
chr1 10000 + AAG 10 5 2 0 0 0
chr1 10001 - TTC 10 5 2 0 0 0

With --save-rest:

chrom pos strand motif u0 u1 u2 m0 m1 m2 o0 o1 o2
chr1 10000 + AAG 10 5 2 0 0 0 1 2 3
chr1 10001 - TTC 10 5 2 0 0 0 1 2 3
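
For downstream work, the output can be read with Python's standard `csv` module. The sketch below assumes tab-separated output with the header line shown above. The per-site ratio m2 / (u2 + m2) is only one possible summary chosen for illustration; it is not a statistic defined by CountMut, and `read_sites` is a hypothetical helper.

```python
import csv
import io

# Inline sample mirroring the example above (tab-separated, with header).
SAMPLE = (
    "chrom\tpos\tstrand\tmotif\tu0\tu1\tu2\tm0\tm1\tm2\n"
    "chr1\t10000\t+\tAAG\t10\t5\t2\t0\t0\t0\n"
    "chr1\t10001\t-\tTTC\t10\t5\t2\t0\t0\t0\n"
)


def read_sites(handle):
    """Yield (chrom, pos, strand, ratio) per row; ratio is None when u2 + m2 is zero."""
    for row in csv.DictReader(handle, delimiter="\t"):
        u2, m2 = int(row["u2"]), int(row["m2"])
        depth = u2 + m2
        ratio = m2 / depth if depth else None
        yield row["chrom"], int(row["pos"]), row["strand"], ratio


for site in read_sites(io.StringIO(SAMPLE)):
    print(site)
```

To read a real file, replace the `io.StringIO` handle with `open("mutations.tsv", newline="")`.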

 

Copyright © 2025-present Chang Y
