# Rust_covpyo3

A fast, Rust-backed Python library for computing per-base coverage over genomic regions from BAM files.

---



## Installation

### Dependencies
Python dependencies:
```
pip install maturin
pip install numpy
```

Rust:
To install Rust, go to https://www.rust-lang.org/tools/install and follow the instructions. Usually you only need to paste a single command into your terminal.

### Install with pip
The package can be installed with pip:
```bash
pip install Rust_covpyo3
```

### Local build
If no prebuilt wheel is available for your platform, or you want to build from source, you'll need to compile the Rust backend yourself.

**1. Install Python dependencies**

```bash
pip install maturin numpy
```

**2. Install a recent Rust toolchain**

Follow the official instructions at https://www.rust-lang.org/tools/install — usually a single command pasted into your terminal.

**3. Build and install the wheel**

From the repository root:

```bash
cd Rust_covpyo3
maturin build --release
python -m pip install -U target/wheels/*.whl
```

The build can take from 30 seconds to a few minutes, depending on your machine and your internet connection (Cargo downloads the Rust dependencies on the first build).

> 💡 If you build multiple times, clear `target/wheels/` first so `pip` only sees one wheel to install.

> 💡 If you're working in a virtual environment and want hot-reloading during development, use `maturin develop` instead of `maturin build`. See the [maturin documentation](https://github.com/PyO3/maturin) for details.


## Usage

### `get_coverage_algo2`

Computes per-base coverage over a genomic region using an interval-based algorithm. Rather than piling up base-by-base, it parses each read's CIGAR string to determine the reference positions it covers, then increments a coverage array for those positions. This makes it efficient for sparse regions and gives you fine-grained control over which reads to include.

```python
from Rust_covpyo3 import get_coverage_algo2

coverage = get_coverage_algo2(
    start=10000,
    end=20000,
    chrom="chr1",
    strand="Plus",
    bam_path="sample.bam",
    lib="frFirstStrand",
    mapq_thr=10,
    flag_in=0,
    flag_exclude=256,
)
# coverage is a list of ints, one per position from start to end
```

### Parameters

| Parameter | Type | Description |
|---|---|---|
| `start` | `int` | Start of the region (0-based, inclusive) |
| `end` | `int` | End of the region (0-based, exclusive) |
| `chrom` | `str` | Chromosome / sequence name |
| `strand` | `str` | `"Plus"`, `"Minus"`, or `"NA"` (unstranded) |
| `bam_path` | `str` | Path to an indexed BAM file |
| `lib` | `str` | Library type — accepted values: `frFirstStrand` (TruSeq stranded), `frSecondStrand`, `fFirstStrand`, `fSecondStrand`, `ffFirstStrand`, `ffSecondStrand`, `rfFirstStrand`, `rfSecondStrand`, `rFirstStrand`, `rSecondStrand`. See [BAMstrandSpecifier](https://github.com/rLannes/BAMstrandSpecifier) |
| `mapq_thr` | `int` | Minimum mapping quality. Set to `0` to disable filtering |
| `flag_in` | `int` | SAM flags that **must** be set (bitwise). Use `0` for no requirement |
| `flag_exclude` | `int` | SAM flags that **must not** be set (bitwise). e.g. `256` to exclude secondary alignments |
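
Since `flag_in` and `flag_exclude` are bitmasks, several SAM flags can be combined with a bitwise OR. The sketch below uses the flag values from the SAM specification. The constant names and the `passes_flag_filter` helper are hypothetical and not part of this library. The helper only illustrates one plausible reading of the semantics described in the table, namely that every bit of `flag_in` must be set and no bit of `flag_exclude` may be set.

```python
# SAM flag bits, as defined in the SAM specification.
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
SECONDARY = 0x100
QC_FAIL = 0x200
DUPLICATE = 0x400
SUPPLEMENTARY = 0x800

# Require properly paired reads.
flag_in = PAIRED | PROPER_PAIR  # 3

# Drop unmapped, secondary, QC-failed, duplicate and supplementary reads.
flag_exclude = UNMAPPED | SECONDARY | QC_FAIL | DUPLICATE | SUPPLEMENTARY  # 3844


def passes_flag_filter(flag, flag_in, flag_exclude):
    """Hypothetical helper: assumes all bits of flag_in must be set
    and no bit of flag_exclude may be set."""
    return (flag & flag_in) == flag_in and (flag & flag_exclude) == 0


print(passes_flag_filter(99, flag_in, flag_exclude))   # 99 = paired, proper pair, mate reverse, first in pair
print(passes_flag_filter(355, flag_in, flag_exclude))  # 355 = 99 + secondary
```

The resulting integers can then be passed as `flag_in=3` and `flag_exclude=3844`.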

### Returns

A `list[int]` of length `end - start`, where each element is the read depth at that position.
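
Because the result is a plain list indexed from `start`, element `i` corresponds to genomic position `start + i`. As a minimal sketch of how such a list might be post-processed, the hypothetical helper below (`summarize_coverage` is not part of this library) computes the mean depth, the fraction of covered positions, and the 0-based half-open runs whose depth falls below a threshold. A hand-written list stands in for a real return value.

```python
from statistics import mean


def summarize_coverage(coverage, start, min_depth=1):
    """Return mean depth, breadth, and runs below min_depth as (start, end) pairs."""
    low_runs = []
    run_start = None
    for offset, depth in enumerate(coverage):
        if depth < min_depth:
            if run_start is None:
                run_start = start + offset  # open a new low-coverage run
        elif run_start is not None:
            low_runs.append((run_start, start + offset))  # close the run
            run_start = None
    if run_start is not None:
        low_runs.append((run_start, start + len(coverage)))

    covered = sum(1 for depth in coverage if depth >= min_depth)
    return {
        "mean_depth": mean(coverage) if coverage else 0.0,
        "breadth": covered / len(coverage) if coverage else 0.0,
        "low_runs": low_runs,
    }


# Stand-in for the result of a call with start=100, end=108.
example = [0, 0, 3, 5, 5, 0, 2, 0]
summary = summarize_coverage(example, start=100)
print(summary)
```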

### How it works

1. All reads overlapping the `[start, end)` region are fetched from the BAM index.
2. Each read is filtered by `flag_in` / `flag_exclude` and mapping quality.
3. For strand-specific libraries, the read's strand is inferred from its flags and the library type. Only reads matching the requested `strand` are kept. For unstranded libraries, all passing reads are counted.
4. The read's CIGAR string is parsed to extract the intervals on the reference that the read actually covers (skipping deletions and spliced regions).
5. Those intervals are intersected with `[start, end)` and the corresponding positions in the output array are incremented.
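
Steps 4 and 5 can be sketched in pure Python. This is an illustration of the general technique under stated assumptions, not the library's Rust implementation: `reference_blocks` and `add_read` are hypothetical names, reads are given directly as a 0-based start position plus a CIGAR string, and the fetching, filtering and strand steps are left out.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
COVERING = set("M=X")  # consume the reference and count as coverage
SKIPPING = set("DN")   # consume the reference but are skipped (deletions, introns)


def reference_blocks(pos, cigar):
    """Return the half-open reference intervals covered by one read."""
    blocks = []
    ref = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in COVERING:
            if blocks and blocks[-1][1] == ref:
                # Extend the previous block (e.g. across an insertion).
                blocks[-1] = (blocks[-1][0], ref + length)
            else:
                blocks.append((ref, ref + length))
            ref += length
        elif op in SKIPPING:
            ref += length
        # I, S, H and P do not consume the reference.
    return blocks


def add_read(coverage, start, end, pos, cigar):
    """Intersect each block with [start, end) and increment the array."""
    for block_start, block_end in reference_blocks(pos, cigar):
        lo = max(block_start, start)
        hi = min(block_end, end)
        for p in range(lo, hi):
            coverage[p - start] += 1


start, end = 100, 120
coverage = [0] * (end - start)
add_read(coverage, start, end, pos=95, cigar="10M5N10M")   # spliced read
add_read(coverage, start, end, pos=102, cigar="5M2D5M3S")  # deletion and soft clip
print(coverage)
```

The spliced read contributes nothing to positions 105 to 109, and the deleted bases at 107 and 108 are likewise not counted, which matches the behaviour described in step 4.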

