# Rust_covpyo3
A fast, Rust-backed Python library for computing per-base coverage over genomic regions from BAM files.
---
## Installation
### Dependencies
Python dependencies:
```bash
pip install maturin
pip install numpy
```
Rust:
To install Rust, follow the instructions at https://www.rust-lang.org/tools/install. This usually only requires pasting a single command into your terminal.
### Install with pip
The package can be installed directly with pip:
```bash
pip install Rust_covpyo3
```
### Local build
If no prebuilt wheel is available for your platform, or you want to build from source, you'll need to compile the Rust backend yourself.
**1. Install Python dependencies**
```bash
pip install maturin numpy
```
**2. Install a recent Rust toolchain**
Follow the official instructions at https://www.rust-lang.org/tools/install — usually a single command pasted into your terminal.
**3. Build and install the wheel**
From the repository root:
```bash
cd Rust_covpyo3
maturin build --release
python -m pip install -U target/wheels/*.whl
```
The build can take 30 seconds to a few minutes, depending on your machine and your internet connection (Cargo downloads the Rust dependencies on the first build).
> 💡 If you build multiple times, clear `target/wheels/` first so `pip` only sees one wheel to install.
> 💡 If you're working in a virtual environment and want a faster edit-and-test loop during development, use `maturin develop` instead of `maturin build`: it builds the crate and installs it directly into the active environment. See the [maturin documentation](https://github.com/PyO3/maturin) for details.
## Usage
### `get_coverage_algo2`
Computes per-base coverage over a genomic region using an interval-based algorithm. Rather than piling up base-by-base, it parses each read's CIGAR string to determine the reference positions it covers, then increments a coverage array for those positions. This makes it efficient for sparse regions and gives you fine-grained control over which reads to include.
```python
from Rust_covpyo3 import get_coverage_algo2
coverage = get_coverage_algo2(
    start=10000,
    end=20000,
    chrom="chr1",
    strand="Plus",
    bam_path="sample.bam",
    lib="frFirstStrand",
    mapq_thr=10,
    flag_in=0,
    flag_exclude=256,
)
# coverage is a list of ints, one per position from start to end
```
### Parameters
| Parameter | Type | Description |
|---|---|---|
| `start` | `int` | Start of the region (0-based, inclusive) |
| `end` | `int` | End of the region (0-based, exclusive) |
| `chrom` | `str` | Chromosome / sequence name |
| `strand` | `str` | `"Plus"`, `"Minus"`, or `"NA"` (unstranded) |
| `bam_path` | `str` | Path to an indexed BAM file |
| `lib` | `str` | Library type — accepted values: `frFirstStrand` (TruSeq stranded), `frSecondStrand`, `fFirstStrand`, `fSecondStrand`, `ffFirstStrand`, `ffSecondStrand`, `rfFirstStrand`, `rfSecondStrand`, `rFirstStrand`, `rSecondStrand`. See [BAMstrandSpecifier](https://github.com/rLannes/BAMstrandSpecifier) |
| `mapq_thr` | `int` | Minimum mapping quality. Set to `0` to disable filtering |
| `flag_in` | `int` | SAM flags that **must** be set (bitwise). Use `0` for no requirement |
| `flag_exclude` | `int` | SAM flags that **must not** be set (bitwise). e.g. `256` to exclude secondary alignments |
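
Since `flag_in` and `flag_exclude` are bitmasks, several SAM flags can be combined with a bitwise OR. The sketch below uses the flag bit values from the SAM specification. The constant names and the `passes_flags` helper are illustrative only and are not part of Rust_covpyo3; the check it performs is an assumption about how "must be set" / "must not be set" filters are usually applied, not a copy of the library's code.

```python
# SAM flag bits, as defined in the SAM specification.
PAIRED = 0x1            # 1
PROPER_PAIR = 0x2       # 2
SECONDARY = 0x100       # 256
DUPLICATE = 0x400       # 1024
SUPPLEMENTARY = 0x800   # 2048

# Require properly paired reads; drop secondary, duplicate
# and supplementary alignments.
flag_in = PAIRED | PROPER_PAIR                          # 3
flag_exclude = SECONDARY | DUPLICATE | SUPPLEMENTARY    # 3328


def passes_flags(read_flag: int, flag_in: int, flag_exclude: int) -> bool:
    """Hypothetical helper: True if every required bit is set
    and no excluded bit is set."""
    required_ok = (read_flag & flag_in) == flag_in
    excluded_ok = (read_flag & flag_exclude) == 0
    return required_ok and excluded_ok
```

The resulting integers can then be passed as `flag_in=flag_in, flag_exclude=flag_exclude` in the call shown earlier.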
### Returns
A `list[int]` of length `end - start`, where each element is the read depth at that position.
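
Because the list starts at `start`, the depth at a 0-based reference position `pos` sits at index `pos - start`. A minimal sketch of that mapping, using a made-up coverage list in place of a real call to `get_coverage_algo2`; the `depth_at` helper is hypothetical and not part of the package:

```python
start, end = 10000, 10008

# Stand-in for the list returned by
# get_coverage_algo2(start=start, end=end, ...).
coverage = [0, 2, 2, 3, 3, 3, 1, 0]


def depth_at(coverage: list[int], start: int, pos: int) -> int:
    """Hypothetical helper: depth at 0-based reference position `pos`."""
    index = pos - start
    if not 0 <= index < len(coverage):
        raise IndexError(f"position {pos} is outside the queried region")
    return coverage[index]


# Simple summaries over the region.
mean_depth = sum(coverage) / len(coverage)
covered_bases = sum(1 for depth in coverage if depth > 0)
```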
### How it works
1. All reads overlapping the `[start, end)` region are fetched from the BAM index.
2. Each read is filtered by `flag_in` / `flag_exclude` and mapping quality.
3. For strand-specific libraries, the read's strand is inferred from its flags and the library type. Only reads matching the requested `strand` are kept. For unstranded libraries, all passing reads are counted.
4. The read's CIGAR string is parsed to extract the intervals on the reference that the read actually covers (skipping deletions and spliced regions).
5. Those intervals are intersected with `[start, end)` and the corresponding positions in the output array are incremented.
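
Steps 4 and 5 can be sketched in pure Python. This is an illustration of the interval idea under stated assumptions, not the Rust implementation: the function names `reference_blocks` and `add_read` are hypothetical, reads are given as a 0-based leftmost position plus a CIGAR string, and only `M`, `=` and `X` operations are counted as covered.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_blocks(pos: int, cigar: str) -> list[tuple[int, int]]:
    """Half-open reference intervals covered by the aligned bases of a read.

    `pos` is the 0-based leftmost reference position of the alignment.
    """
    blocks = []
    ref = pos
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":      # aligned bases: consume reference, covered
            blocks.append((ref, ref + n))
            ref += n
        elif op in "DN":     # deletion / spliced gap: consume reference, skipped
            ref += n
        # I, S, H and P do not consume the reference.
    return blocks


def add_read(coverage: list[int], start: int, end: int,
             pos: int, cigar: str) -> None:
    """Clip each block to [start, end) and increment the covered positions."""
    for block_start, block_end in reference_blocks(pos, cigar):
        lo, hi = max(block_start, start), min(block_end, end)
        for p in range(lo, hi):
            coverage[p - start] += 1


start, end = 100, 120
coverage = [0] * (end - start)
add_read(coverage, start, end, pos=95, cigar="10M5N10M")   # spliced read
add_read(coverage, start, end, pos=102, cigar="3S4M2D4M")  # soft clip + deletion
```

The spliced read contributes nothing to positions 105-109 and the second read contributes nothing to the two deleted bases at 106-107, so those two positions stay at depth 0.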