Ultra-fast strand-aware mutation counter

CountMut

Ultra-fast strand-aware mutation counter for bisulfite sequencing analysis

CountMut is a high-performance tool for counting mutations from bisulfite sequencing BAM files (BS-seq, CAM-seq, GLORI-seq, eTAM-seq). It features parallel processing, quality-based mate overlap deduplication, and optimized file I/O for maximum speed.

Features

  • 🚀 Ultra-Fast: Calls mutations without building read pileups
  • 🧬 Bisulfite Support: NS, Zf, Yf tag filtering for conversion analysis
  • 🎯 Accurate: Quality-based mate overlap deduplication prevents double-counting
  • Parallel: Multi-threaded genomic window processing
  • 🔧 Flexible: Configurable filtering, strand-specific processing, auto-indexing
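
The idea behind quality-based mate overlap deduplication can be sketched in a few lines of Python. This is an illustrative sketch only, not CountMut's actual implementation: the `BaseCall` type, the `dedup_mate_overlap` function, and the tie-breaking rule are all hypothetical. It assumes that when both mates of a pair cover the same reference position, only the call with the higher base quality is kept, so the fragment contributes one observation per position.

```python
from typing import NamedTuple


class BaseCall(NamedTuple):
    """One base observed by one mate at a reference position (hypothetical type)."""
    ref_pos: int  # 0-based reference coordinate
    base: str     # called base
    qual: int     # Phred base quality


def dedup_mate_overlap(mate1: list[BaseCall], mate2: list[BaseCall]) -> list[BaseCall]:
    """Merge the calls of two mates so each reference position is counted once.

    Where both mates cover a position, keep the call with the higher base
    quality. Ties keep the mate-1 call (an arbitrary choice for this sketch).
    """
    best: dict[int, BaseCall] = {call.ref_pos: call for call in mate1}
    for call in mate2:
        current = best.get(call.ref_pos)
        if current is None or call.qual > current.qual:
            best[call.ref_pos] = call
    return [best[pos] for pos in sorted(best)]
```

Without this step, a mutation inside the overlap of two mates would be counted twice even though it comes from a single sequenced fragment.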

Installation

pip install countmut

Quick Start

# Basic usage - auto-creates indices if needed
countmut -i input.bam -r reference.fa -o mutations.tsv

# Count T→C mutations instead of the default A→G
countmut -i input.bam -r reference.fa -o mutations.tsv --ref-base T --mut-base C

# With custom threads and filtering
countmut -i input.bam -r reference.fa -o mutations.tsv -t 8 --max-unc 5 --min-con 2
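
When CountMut is one step in a larger Python pipeline, the same commands can be assembled programmatically. The sketch below uses only flags listed under Options. The helper names `build_countmut_cmd` and `run_countmut` are hypothetical and not part of the package, and the sketch assumes the `countmut` executable is on `PATH`.

```python
import subprocess


def build_countmut_cmd(bam, reference, output, ref_base="A", mut_base="G", threads=None):
    """Build the argument list for a countmut run (hypothetical helper)."""
    cmd = [
        "countmut",
        "-i", bam,
        "-r", reference,
        "-o", output,
        "--ref-base", ref_base,
        "--mut-base", mut_base,
    ]
    # Omit -t so that countmut falls back to its own default thread count.
    if threads is not None:
        cmd += ["-t", str(threads)]
    return cmd


def run_countmut(*args, **kwargs):
    """Run countmut and raise CalledProcessError on a non-zero exit status."""
    subprocess.run(build_countmut_cmd(*args, **kwargs), check=True)
```

For example, `run_countmut("input.bam", "reference.fa", "mutations.tsv", threads=8)` would run the basic command from above with 8 threads.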

Options

Input/Output

-i, --input PATH       Input BAM file (coordinate-sorted) [required]
-r, --reference PATH   Reference FASTA file [required]
-o, --output PATH      Output TSV file (default: stdout)
-f, --force            Overwrite output without prompting

Mutation Analysis

--ref-base TEXT        Reference base to count from [default: A]
--mut-base TEXT        Mutation base to count [default: G]
--strand TEXT          Strand: both/forward/reverse [default: both]
--region TEXT          Genomic region (e.g., 'chr1:1000000-2000000')

Performance

-t, --threads INTEGER  Number of parallel threads [default: auto]
-b, --bin-size INTEGER Genomic bin size in bp [default: 10000]
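
To illustrate how a bin size and a thread count interact, here is a minimal sketch of splitting a region into fixed-size windows and processing them in parallel. It is not CountMut's internal code: `iter_bins`, `process_bins`, and the `worker` callback are hypothetical names, and the half-open coordinate convention is an assumption made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def iter_bins(start, end, bin_size=10_000):
    """Yield half-open (bin_start, bin_end) windows covering [start, end)."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    for bin_start in range(start, end, bin_size):
        # The last window is clipped to the end of the region.
        yield bin_start, min(bin_start + bin_size, end)


def process_bins(start, end, worker, bin_size=10_000, threads=4):
    """Apply worker(bin_start, bin_end) to every window, preserving window order."""
    bins = list(iter_bins(start, end, bin_size))
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(lambda b: worker(*b), bins))
```

Smaller bins give the scheduler more units to balance across threads, at the cost of more per-window overhead.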

Alternative Mutation Tagging

--ref-base2 TEXT       Alternative reference base for tagging (e.g., 'C')
--mut-base2 TEXT       Alternative mutation base for tagging (e.g., 'T')
--output-bam PATH      Output BAM with alternative tags (Yc, Zc)

Quality Filters

--min-baseq INTEGER    Min base quality (Phred score) [default: 20]
--min-mapq INTEGER     Min mapping quality (MAPQ) [default: 0]
--max-sub INTEGER      Max substitutions (NS tag) [default: 1]
--trim-start INTEGER   Trim N bases from read 5' end (fragment orientation) [default: 2]
--trim-end INTEGER     Trim N bases from read 3' end (fragment orientation) [default: 2]
--max-unc INTEGER      Max unconverted (Zf tag) [default: 3]
--min-con INTEGER      Min converted (Yf tag) [default: 1]

Output Records

-p, --pad INTEGER      Motif window half-size [default: 15]
-s, --save-rest        Include other bases (o0, o1, o2 columns)
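
The motif window spans `pad` bases on each side of the site, so its length is 2×pad+1 (31 bp with the default pad of 15). The sketch below shows one way such a window could be extracted. It is an assumption-laden illustration: the `motif_window` function is hypothetical, and both the `N` padding at sequence ends and the reverse-complementing of minus-strand sites are guesses about behavior that the tool may handle differently.

```python
# Translation table for complementing DNA bases (case-preserving).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def motif_window(seq, pos, pad=15, strand="+"):
    """Return the (2*pad + 1)-base window centred on 1-based position pos.

    Positions that fall outside the sequence are filled with 'N' (assumption).
    For strand '-', the window is reverse-complemented (assumption).
    """
    center = pos - 1  # convert to a 0-based index
    window = "".join(
        seq[i] if 0 <= i < len(seq) else "N"
        for i in range(center - pad, center + pad + 1)
    )
    if strand == "-":
        window = window.translate(_COMPLEMENT)[::-1]
    return window
```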

Note: BAM files must have NS, Zf, and Yf tags (essential for bisulfite analysis). Indices (.bai, .fai) are created automatically if missing.

Output Format

TSV file with the following columns:

Column        Description
chrom         Chromosome name
pos           Genomic position (1-based)
strand        Strand (+ or -)
motif         Sequence context (2×pad+1 bp window)
u0, u1, u2    Unconverted (reference base) counts
m0, m1, m2    Mutation (mutation base only) counts
o0, o1, o2    Other bases counts (with --save-rest)
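
A typical downstream step is to turn these counts into a per-site mutation ratio. The sketch below reads the TSV by column name and computes m1 / (u1 + m1), using the `1`-suffixed counts that the "Count categories" list describes as high conversion. It assumes the file starts with a header row carrying the column names shown above, which should be checked against real output. The function name `mutation_ratios` is hypothetical.

```python
import csv


def mutation_ratios(handle):
    """Yield (chrom, pos, strand, depth, ratio) for each site in a CountMut TSV.

    Assumes a header row with the column names listed above. ratio is None
    when no u1/m1 reads cover the site.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        u1, m1 = int(row["u1"]), int(row["m1"])
        depth = u1 + m1
        ratio = m1 / depth if depth else None
        yield row["chrom"], int(row["pos"]), row["strand"], depth, ratio
```

Usage: open the output file with `open("mutations.tsv", newline="")` and pass the handle to `mutation_ratios`.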

Count categories (x0, x1, x2):

  • x0 (low quality): Bases failing quality filters (trim region, max-sub, min-mapq, min-baseq)
  • x1 (high conversion): Bases from reads with high conversion efficiency (low Zf and high Yf)
  • x2 (insufficient conversion): Bases from reads with insufficient conversion efficiency (high Zf or low Yf)
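
The three categories can be read as a small decision procedure applied to each observed base. The following sketch is one interpretation of the descriptions above, not the tool's source code. The names `Filters` and `count_category` are hypothetical, the thresholds mirror the documented defaults, and treating the limits as inclusive (`Zf <= max-unc`, `Yf >= min-con`) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Filters:
    """Thresholds mirroring the documented CLI defaults."""
    min_baseq: int = 20
    min_mapq: int = 0
    max_sub: int = 1
    max_unc: int = 3
    min_con: int = 1


def count_category(baseq, mapq, ns, zf, yf, in_trim_region, f=Filters()):
    """Return 0, 1, or 2: the suffix of the counter a base is added to."""
    # x0: the base fails a quality filter.
    if in_trim_region or baseq < f.min_baseq or mapq < f.min_mapq or ns > f.max_sub:
        return 0
    # x1: the read shows high conversion (low Zf and high Yf).
    if zf <= f.max_unc and yf >= f.min_con:
        return 1
    # x2: insufficient conversion (high Zf or low Yf).
    return 2
```

Under this reading, a reference-matching base that returns 1 would increment `u1`, and a mutated base would increment `m1`.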

 

Copyright © 2025-present Chang Y
