CountMut
Ultra-fast strand-aware mutation counter
CountMut counts mutations from bisulfite sequencing / CAM-seq / GLORI-seq / eTAM-seq BAM files with parallel processing, quality-based mate overlap deduplication, and optimized file I/O.
Features
- 🚀 Ultra-Fast: Direct FASTA index reading, shared file handles, BGZF multi-threading
- 🧬 Bisulfite Support: NS, Zf, Yf tag filtering for conversion analysis
- 🎯 Accurate: Quality-based mate overlap deduplication prevents double-counting
- ⚡ Parallel: Multi-threaded genomic window processing
- 🔧 Flexible: Configurable filtering, strand-specific processing, auto-indexing
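To illustrate what quality-based mate overlap deduplication means, here is a minimal sketch, not CountMut's actual code. It assumes each mate is a hypothetical mapping from reference position to `(base, quality)`, that the higher-quality call wins where the mates overlap, and that ties go to the first mate. The real rule may differ.

```python
from typing import Dict, Tuple

# Hypothetical representation: reference position -> (base, Phred quality).
MateBases = Dict[int, Tuple[str, int]]


def merge_mates(mate1: MateBases, mate2: MateBases) -> MateBases:
    """Merge two mates so each reference position is counted once.

    Assumption: where the mates overlap, the base call with the higher
    quality wins, and ties go to mate1. CountMut's real rule may differ.
    """
    merged = dict(mate1)
    for pos, (base, qual) in mate2.items():
        # Take mate2's call only for new positions or strictly better quality.
        if pos not in merged or qual > merged[pos][1]:
            merged[pos] = (base, qual)
    return merged


# Made-up fragment: the mates overlap at positions 101 and 102.
mate1 = {100: ("A", 30), 101: ("G", 12), 102: ("A", 35)}
mate2 = {101: ("A", 38), 102: ("A", 20), 103: ("G", 33)}
fragment = merge_mates(mate1, mate2)
```

Each overlapping position contributes a single base to the counts, so a fragment covered by both mates is not counted twice.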
Installation
pip install countmut
Quick Start
# Basic usage - auto-creates indices if needed
countmut -i input.bam -r reference.fa -o mutations.tsv
# Count T→C mutations instead of the default A→G
countmut -i input.bam -r reference.fa -o mutations.tsv --ref-base T --mut-base C
# With custom threads and filtering
countmut -i input.bam -r reference.fa -o mutations.tsv -t 8 --max-unc 5 --min-con 2
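The same commands can be assembled from a Python pipeline. The sketch below only builds the argument list from flags shown above; `build_countmut_cmd` is a hypothetical helper, not part of the package, and nothing is executed until the list is passed to `subprocess.run`.

```python
import shlex
from typing import List, Optional


def build_countmut_cmd(
    bam: str,
    reference: str,
    output: str,
    ref_base: str = "A",
    mut_base: str = "G",
    threads: Optional[int] = None,
) -> List[str]:
    """Build a countmut argument list (hypothetical helper).

    Only flags documented in this README are used.
    """
    cmd = [
        "countmut",
        "-i", bam,
        "-r", reference,
        "-o", output,
        "--ref-base", ref_base,
        "--mut-base", mut_base,
    ]
    if threads is not None:
        # Omitting -t leaves the thread count at the tool's default.
        cmd += ["-t", str(threads)]
    return cmd


cmd = build_countmut_cmd(
    "input.bam", "reference.fa", "mutations.tsv",
    ref_base="T", mut_base="C", threads=8,
)
print(shlex.join(cmd))
# To run it (requires countmut on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than a shell string avoids quoting problems with file paths that contain spaces.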
Key Options
Required:
-i, --input PATH Input BAM file
-r, --reference PATH Reference FASTA file
Output:
-o, --output PATH Output TSV file (default: stdout)
Mutation:
--ref-base TEXT Reference base [default: A]
--mut-base TEXT Mutation base [default: G]
--strand [both|forward|reverse] Strand processing [default: both]
--region TEXT Specific region (e.g., 'chr1:1000000-2000000')
Performance:
-t, --threads INTEGER Number of threads [default: auto]
-b, --bin-size INTEGER Genomic bin size [default: 10000]
Filtering (Bisulfite):
--pad INTEGER Motif window padding [default: 15]
--trim-start INTEGER Trim 5' bases [default: 2]
--trim-end INTEGER Trim 3' bases [default: 2]
--max-unc INTEGER Max unconverted (Zf) [default: 3]
--min-con INTEGER Min converted (Yf) [default: 1]
--max-sub INTEGER Max substitutions (NS) [default: 1]
--min-baseq INTEGER Min base quality (Phred) [default: 20]
--min-mapq INTEGER Min mapping quality (MAPQ) [default: 0]
Note: BAM must have NS, Zf, and Yf tags for bisulfite analysis.
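How the filtering thresholds might combine can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: it assumes quality failures take precedence over conversion status, that the `max`/`min` limits are inclusive, and it ignores the read-end trimming and motif padding options. `Thresholds` and `classify_base` are hypothetical names; the default values are taken from the option list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Defaults copied from the option list above."""
    max_unc: int = 3     # --max-unc, compared with the Zf tag
    min_con: int = 1     # --min-con, compared with the Yf tag
    max_sub: int = 1     # --max-sub, compared with the NS tag
    min_baseq: int = 20  # --min-baseq
    min_mapq: int = 0    # --min-mapq


def classify_base(zf: int, yf: int, ns: int, baseq: int, mapq: int,
                  t: Thresholds = Thresholds()) -> str:
    """Return 'drop', 'unconverted', or 'clean' for one base.

    Assumptions: quality checks run first, and limits are inclusive
    (a read with Zf == max_unc still passes). Trimming is not modeled.
    """
    if ns > t.max_sub or baseq < t.min_baseq or mapq < t.min_mapq:
        return "drop"
    if zf > t.max_unc or yf < t.min_con:
        return "unconverted"
    return "clean"


print(classify_base(zf=0, yf=5, ns=0, baseq=30, mapq=60))
print(classify_base(zf=4, yf=5, ns=0, baseq=30, mapq=60))
print(classify_base(zf=0, yf=5, ns=0, baseq=10, mapq=60))
```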
Output Format
TSV file with columns:
- chrom, pos, strand, motif - Position and sequence context
- u0, u1, u2 - Unconverted (reference base) counts [drop, clean, unconverted]
- m0, m1, m2 - Mutation (mutation base only) counts [drop, clean, unconverted]
- o0, o1, o2 - Other bases counts [drop, clean, unconverted] (only with --save-rest)
Where:
- drop (x0): Bases failing quality filters (internal position, mismatch, mapq, baseq)
- clean (x1): High-quality bases passing all filters
- unconverted (x2): Bases in unconverted reads (high Zf or low Yf)
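A typical downstream step is a per-site mutation ratio from the clean counts. The sketch below assumes the output is tab-delimited with a header row that uses the column names listed above; the two sample rows are made up for illustration, and `clean_mutation_ratio` is a hypothetical helper.

```python
import csv
import io

# Made-up sample in the assumed layout (header row, tab-delimited).
sample = (
    "chrom\tpos\tstrand\tmotif\tu0\tu1\tu2\tm0\tm1\tm2\n"
    "chr1\t1000\t+\tGGACT\t1\t30\t2\t0\t10\t0\n"
    "chr1\t2000\t-\tAAACA\t0\t0\t1\t0\t0\t0\n"
)


def clean_mutation_ratio(row):
    """Return m1 / (u1 + m1), or None when the site has no clean coverage."""
    u1, m1 = int(row["u1"]), int(row["m1"])
    depth = u1 + m1
    return m1 / depth if depth else None


rows = list(csv.DictReader(io.StringIO(sample), delimiter="\t"))
ratios = [clean_mutation_ratio(r) for r in rows]

for row, ratio in zip(rows, ratios):
    print(row["chrom"], row["pos"], row["strand"], ratio)
```

For a real file, replace `io.StringIO(sample)` with an open file handle on the TSV written by `-o`.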
Copyright © 2025-present Chang Y