
A Rust-based BAM depth calculator for Python.


🦀 rustbam - Rust-powered fast BAM depth extraction with Python bindings

rustbam is a high-performance BAM depth calculator written in Rust, with Python bindings for fast and efficient genomic data analysis.

📦 Installation

Install from PyPI (No Conda Required)

You can install rustbam directly with pip:

pip install rustbam

🛠️ Usage

Python API

After installation, you can use rustbam in Python:

import rustbam

positions, depths = rustbam.get_depths(
    bam_path,         # path to bam file
    chromosome,       # chromosome/contig name
    start,            # 1-based inclusive start coordinate
    end,              # 1-based inclusive end coordinate
    step=10,          # step as in range(start, end, step) - default: 1
    min_mapq=0,       # minimum mapping quality - default 0
    min_bq=13,        # minimum base quality - default 13 (as in samtools mpileup)
    max_depth=8000,   # maximum depth to return per base position - default 8000
    num_threads=12,   # number of threads for parallelization - default 12
)

print(positions[:5])  # e.g. [100000, 100010, 100020, 100030, 100040]
print(depths[:5])     # e.g. [12, 15, 10, 8, 20]
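
Once you have the two lists, ordinary Python is enough to summarize coverage. The sketch below assumes only that positions and depths are equal-length sequences of integers, as the example output above suggests. The helper name summarize_depths and the min_depth threshold are hypothetical and not part of rustbam.

```python
from statistics import mean


def summarize_depths(positions, depths, min_depth=10):
    """Return basic coverage statistics for parallel position/depth lists.

    Hypothetical helper for illustration; it is not provided by rustbam.
    """
    if len(positions) != len(depths):
        raise ValueError("positions and depths must have the same length")
    if not depths:
        raise ValueError("no depth values to summarize")
    # Positions whose depth falls below the chosen threshold.
    low = [pos for pos, depth in zip(positions, depths) if depth < min_depth]
    return {
        "mean_depth": mean(depths),
        "min_depth": min(depths),
        "max_depth": max(depths),
        "low_coverage_positions": low,
    }


# Illustrative values copied from the example comments above, not real data.
positions = [100000, 100010, 100020, 100030, 100040]
depths = [12, 15, 10, 8, 20]

summary = summarize_depths(positions, depths, min_depth=10)
print(summary)
```

In real use, the two lists would come from rustbam.get_depths(...) instead of the hard-coded values.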

CLI (Command Line Interface)

After installation, you can use rustbam in your shell (note that coordinates are 1-based and inclusive, as in samtools mpileup):

$ rustbam --help
usage: rustbam [-h] [-t STEP] [-Q MIN_MAPQ] [-q MIN_BQ] [-d MAX_DEPTH] [-n NUM_THREADS] [-j] bam chromosome start end

Compute sequencing depth from a BAM file.

positional arguments:
  bam                   Path to the indexed BAM file
  chromosome            Chromosome name (e.g., 'chr1')
  start                 Start position (1-based)
  end                   End position (1-based)

options:
  -h, --help            show this help message and exit
  -t STEP, --step STEP  Step size for sampling positions (default: 1)
  -Q MIN_MAPQ, --min_mapq MIN_MAPQ
                        Minimum mapping quality (default: 0)
  -q MIN_BQ, --min_bq MIN_BQ
                        Minimum base quality (default: 13)
  -d MAX_DEPTH, --max_depth MAX_DEPTH
                        Maximum depth allowed (default: 8000)
  -n NUM_THREADS, --num_threads NUM_THREADS
                        Number of threads (default: 12)
  -j, --json            Output results in JSON format

An example usage of the CLI:

$ rustbam tests/example.bam chr1 1000000 1000005
1000000 51
1000001 52
1000002 44
1000003 52
1000004 53
1000005 47
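
If you capture the plain-text output from another program, it can be read back into two lists. This is a minimal sketch that assumes each line holds a position and a depth separated by whitespace, as in the example above. The exact delimiter is not documented here, so the code splits on any whitespace. The function name parse_depth_lines is hypothetical.

```python
def parse_depth_lines(text):
    """Parse 'position depth' lines into two parallel lists of ints.

    Hypothetical helper; assumes whitespace-separated columns.
    """
    positions, depths = [], []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        positions.append(int(fields[0]))
        depths.append(int(fields[1]))
    return positions, depths


# Sample text copied from the example CLI output above.
sample_output = """\
1000000 51
1000001 52
1000002 44
1000003 52
1000004 53
1000005 47
"""

cli_positions, cli_depths = parse_depth_lines(sample_output)
print(cli_positions)
print(cli_depths)
```

The text could also come from the standard library, for example subprocess.run([...], capture_output=True, text=True).stdout, rather than from a hard-coded string.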

rustbam computes depths considerably faster than samtools mpileup when run with multiple threads (-n). In the run below, wall-clock time drops from about 53 s to about 19 s:

$ time samtools mpileup /path/to/a/large/bam -r chr1:1-30000000 > /dev/null
[mpileup] 1 samples in 1 input files

real    0m52.897s
user    0m52.270s
sys     0m0.436s

$ time rustbam /path/to/a/large/bam chr1 1 30000000 -n 12 > /dev/null

real    0m18.725s
user    0m50.806s
sys     0m6.303s
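
To make the comparison concrete, the arithmetic on the timings above can be sketched as follows. The numbers are copied from the two runs shown. They describe a single run on one machine, so treat the ratio as indicative only.

```python
# Timings in seconds, copied from the `time` output above.
samtools_time = {"real": 52.897, "user": 52.270, "sys": 0.436}
rustbam_time = {"real": 18.725, "user": 50.806, "sys": 6.303}

# Wall-clock speedup: how much sooner the result arrives.
wall_speedup = samtools_time["real"] / rustbam_time["real"]

# Total CPU time (user + sys): how much work the machine did overall.
samtools_cpu = samtools_time["user"] + samtools_time["sys"]
rustbam_cpu = rustbam_time["user"] + rustbam_time["sys"]

print(f"wall-clock speedup: {wall_speedup:.2f}x")
print(f"CPU seconds: samtools {samtools_cpu:.3f}, rustbam {rustbam_cpu:.3f}")
```

The wall-clock speedup is roughly 2.8x, while total CPU time is slightly higher for rustbam. So in this run the gain comes from spreading the work across threads rather than from doing less work.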

Don't even get me started on pysam: rustbam is about 16x faster with -n 12, which is the default. 😠


🔥 Features

Fast: Uses Rust’s efficient rust-htslib for BAM processing, and supports parallelism.
Python bindings: Seamless integration with Python via pyo3.
Custom filtering: Supports minimum mapping quality (-Q), minimum base quality (-q), and max depth (-d).
Supports large BAM files: Uses IndexedReader for efficient region querying.


📜 License

rustbam is released under the MIT License. See LICENSE for details.


🤝 Contributing

  1. Fork the repo on GitHub.
  2. Create a new branch: git checkout -b feature-new
  3. Commit your changes: git commit -m "Add new feature"
  4. Push to your branch: git push origin feature-new
  5. Open a Pull Request 🎉

🌍 Acknowledgments

Built using rust-htslib and pyo3.
