
Ultra-fast strand-aware mutation counter


CountMut


Ultra-fast strand-aware mutation counter for bisulfite sequencing analysis

CountMut is a high-performance tool for counting mutations from bisulfite sequencing BAM files (BS-seq, CAM-seq, GLORI-seq, eTAM-seq). It features parallel processing, quality-based mate overlap deduplication, and optimized file I/O for maximum speed.

Features

  • 🚀 Ultra-Fast: Calls mutations without building read pileups
  • 🧬 Bisulfite Support: NS, Zf, Yf tag filtering for conversion analysis
  • 🎯 Accurate: Quality-based mate overlap deduplication prevents double-counting
  • Parallel: Multi-threaded genomic window processing
  • 🔧 Flexible: Configurable filtering, strand-specific processing, auto-indexing
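
To illustrate what quality-based mate overlap deduplication means, here is a minimal sketch. It is not CountMut's actual code: the dictionary representation of a mate, the function name `merge_mate_overlap`, and the tie-breaking rule are all assumptions made for illustration. The idea shown is that when both mates of a pair cover the same reference position, only the call with the higher base quality is kept, so the fragment contributes one count per position.

```python
from typing import Dict, Tuple

# Hypothetical representation: reference position -> (base, Phred quality).
MateBases = Dict[int, Tuple[str, int]]


def merge_mate_overlap(read1: MateBases, read2: MateBases) -> MateBases:
    """Combine two mates so each reference position is counted once.

    Where both mates cover a position, keep the call with the higher
    base quality. Ties keep read1 (this tie rule is an assumption).
    """
    merged = dict(read1)
    for pos, (base, qual) in read2.items():
        if pos not in merged or qual > merged[pos][1]:
            merged[pos] = (base, qual)
    return merged


read1 = {100: ("A", 30), 101: ("G", 12)}
read2 = {101: ("A", 35), 102: ("A", 28)}
fragment = merge_mate_overlap(read1, read2)
print(fragment)
```

Position 101 is covered by both mates, so only the higher-quality call from `read2` survives and the fragment yields three counted positions rather than four.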

Installation

pip install countmut

Quick Start

# Basic usage - auto-creates indices if needed
countmut -i input.bam -r reference.fa -o mutations.tsv

# Count T→C mutations instead of the default A→G
countmut -i input.bam -r reference.fa -o mutations.tsv --ref-base T --mut-base C

# With custom threads and filtering
countmut -i input.bam -r reference.fa -o mutations.tsv -t 8 --max-unc 5 --min-con 2
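
When CountMut is called from a Python pipeline, the same commands can be assembled as an argument list. The sketch below uses only flags shown on this page; the helper name `build_countmut_command` is hypothetical and not part of the package.

```python
import shlex
from typing import List, Optional


def build_countmut_command(
    bam: str,
    reference: str,
    output: str,
    ref_base: str = "A",
    mut_base: str = "G",
    threads: Optional[int] = None,
) -> List[str]:
    """Build an argv list for the countmut CLI (hypothetical helper)."""
    cmd = [
        "countmut",
        "-i", bam,
        "-r", reference,
        "-o", output,
        "--ref-base", ref_base,
        "--mut-base", mut_base,
    ]
    if threads is not None:
        cmd += ["-t", str(threads)]
    return cmd


cmd = build_countmut_command(
    "input.bam", "reference.fa", "mutations.tsv", threads=8
)
print(shlex.join(cmd))
# To run it (requires countmut on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than a single string avoids shell quoting problems with file names that contain spaces.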

Options

Input/Output

-i, --input PATH       Input BAM file (coordinate-sorted) [required]
-r, --reference PATH   Reference FASTA file [required]
-o, --output PATH      Output TSV file (default: stdout)
-f, --force            Overwrite output without prompting

Mutation Analysis

--ref-base TEXT        Reference base to count from [default: A]
--mut-base TEXT        Mutation base to count [default: G]
--strand TEXT          Strand: both/forward/reverse [default: both]
--region TEXT          Genomic region (e.g., 'chr1:1000000-2000000')

Performance

-t, --threads INTEGER  Number of parallel threads [default: auto]
-b, --bin-size INTEGER Genomic bin size in bp [default: 10000]
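
As a rough picture of what a genomic bin is, the sketch below splits a chromosome into fixed-size windows that could each be handed to a worker. The function name `iter_bins` and the 0-based, half-open coordinates are assumptions for illustration; the tool's internal windowing may differ.

```python
from typing import Iterator, Tuple


def iter_bins(chrom_length: int, bin_size: int = 10_000) -> Iterator[Tuple[int, int]]:
    """Yield 0-based, half-open (start, end) windows covering a chromosome."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    for start in range(0, chrom_length, bin_size):
        # The last window is shortened to end at the chromosome boundary.
        yield start, min(start + bin_size, chrom_length)


bins = list(iter_bins(25_000))
print(bins)  # [(0, 10000), (10000, 20000), (20000, 25000)]
```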

Alternative Mutation Tagging

--ref-base2 TEXT       Alternative reference base for tagging (e.g., 'C')
--mut-base2 TEXT       Alternative mutation base for tagging (e.g., 'T')
--output-bam PATH      Output BAM with alternative tags (Yc, Zc)

Quality Filters

--min-baseq INTEGER    Min base quality (Phred score) [default: 20]
--min-mapq INTEGER     Min mapping quality (MAPQ) [default: 0]
--max-sub INTEGER      Max substitutions (NS tag) [default: 1]
--trim-start INTEGER   Trim N bases from read 5' end (fragment orientation) [default: 2]
--trim-end INTEGER     Trim N bases from read 3' end (fragment orientation) [default: 2]
--max-unc INTEGER      Max unconverted (Zf tag) [default: 3]
--min-con INTEGER      Min converted (Yf tag) [default: 1]

Output Records

-p, --pad INTEGER      Motif window half-size [default: 15]
-s, --save-rest        Include other bases (o0, o1, o2 columns)

Note: BAM files must have NS, Zf, and Yf tags (essential for bisulfite analysis). Indices (.bai, .fai) are created automatically if missing.

Output Format

TSV file with the following columns:

Column        Description
chrom         Chromosome name
pos           Genomic position (1-based)
strand        Strand (+ or -)
motif         Sequence context (2×pad+1 bp window)
u0, u1, u2    Unconverted (reference base) counts
m0, m1, m2    Mutation (mutation base only) counts
o0, o1, o2    Other bases counts (with --save-rest)
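
The motif column spans 2×pad+1 bases, so the default pad of 15 gives a 31 bp window. The sketch below shows that arithmetic on a plain string. The function name `motif_window` and the choice to pad with N at sequence ends are assumptions, and strand handling (for example, reverse-complementing on the minus strand) is deliberately left out because this page does not specify it.

```python
def motif_window(sequence: str, pos: int, pad: int = 15) -> str:
    """Return the 2*pad+1 bp window centred on 1-based `pos`.

    Positions that fall outside the sequence are filled with 'N'
    (an assumption; the tool may treat edges differently).
    """
    if not 1 <= pos <= len(sequence):
        raise ValueError("pos is outside the sequence")
    center = pos - 1  # convert to a 0-based index
    left = max(0, center - pad)
    right = min(len(sequence), center + pad + 1)
    left_fill = "N" * (pad - (center - left))
    right_fill = "N" * (pad - (right - center - 1))
    return left_fill + sequence[left:right] + right_fill


seq = "ACGTAAGCTT"
print(motif_window(seq, 6, pad=1))   # AAG
print(motif_window(seq, 1, pad=2))   # NNACG
```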

Count categories (x0, x1, x2, where x is u, m, or o):

  • x0 (low quality): Bases failing quality filters (trim region, max-sub, min-mapq, min-baseq)
  • x1 (insufficient conversion): Bases that pass the quality filters but come from reads with insufficient conversion efficiency (Zf above --max-unc or Yf below --min-con)
  • x2 (high conversion): Bases that pass the quality filters and come from reads with high conversion efficiency (Zf within --max-unc and Yf at least --min-con)
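
The three categories can be read as a small decision rule. The sketch below is one interpretation of the descriptions above, not the tool's source: the names `Thresholds` and `count_category` are hypothetical, the defaults are copied from the Quality Filters options, and the exact comparison operators (strict versus inclusive) are an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Defaults mirror the documented CLI defaults.
    min_baseq: int = 20
    min_mapq: int = 0
    max_sub: int = 1
    max_unc: int = 3
    min_con: int = 1


def count_category(
    baseq: int,
    mapq: int,
    ns: int,
    zf: int,
    yf: int,
    in_trim_region: bool,
    t: Thresholds = Thresholds(),
) -> int:
    """Return 0, 1, or 2 for a single base (assumed semantics)."""
    # Category 0: the base or its read fails a quality filter.
    if in_trim_region or ns > t.max_sub or mapq < t.min_mapq or baseq < t.min_baseq:
        return 0
    # Category 1: the read shows insufficient conversion.
    if zf > t.max_unc or yf < t.min_con:
        return 1
    # Category 2: the read passes every filter.
    return 2


print(count_category(baseq=30, mapq=60, ns=0, zf=1, yf=5, in_trim_region=False))  # 2
print(count_category(baseq=30, mapq=60, ns=0, zf=4, yf=5, in_trim_region=False))  # 1
print(count_category(baseq=10, mapq=60, ns=0, zf=1, yf=5, in_trim_region=False))  # 0
```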

Example Output

Without --save-rest:

chrom pos strand motif u0 u1 u2 m0 m1 m2
chr1 10000 + AAG 10 5 2 0 0 0
chr1 10001 - TTC 10 5 2 0 0 0

With --save-rest:

chrom pos strand motif u0 u1 u2 m0 m1 m2 o0 o1 o2
chr1 10000 + AAG 10 5 2 0 0 0 1 2 3
chr1 10001 - TTC 10 5 2 0 0 0 1 2 3
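
For downstream analysis, the TSV can be read with Python's standard `csv` module. The sketch below computes a per-site ratio m2 / (u2 + m2) from the high-conversion counts. That ratio is an assumed, commonly used summary rather than something CountMut defines, the function name `mutation_ratios` is hypothetical, and the sample rows are made-up values in the documented column layout.

```python
import csv
import io

# Made-up rows in the documented column layout (tab-separated).
SAMPLE_TSV = (
    "chrom\tpos\tstrand\tmotif\tu0\tu1\tu2\tm0\tm1\tm2\n"
    "chr1\t10000\t+\tAAG\t10\t5\t8\t0\t0\t2\n"
    "chr1\t10001\t-\tTTC\t10\t5\t0\t0\t0\t0\n"
)


def mutation_ratios(handle):
    """Return (chrom, pos, strand, ratio) tuples; ratio is None at zero depth."""
    rows = []
    for rec in csv.DictReader(handle, delimiter="\t"):
        u2, m2 = int(rec["u2"]), int(rec["m2"])
        depth = u2 + m2
        ratio = m2 / depth if depth else None
        rows.append((rec["chrom"], int(rec["pos"]), rec["strand"], ratio))
    return rows


ratios = mutation_ratios(io.StringIO(SAMPLE_TSV))
for row in ratios:
    print(row)
```

To read a real output file, replace the `io.StringIO` object with `open("mutations.tsv", newline="")`.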

 

Copyright © 2025-present Chang Y
