Fast Rust-backed per-base coverage computation over genomic regions from BAM files.

# Rust_covpyo3

A fast, Rust-backed Python library for computing per-base coverage over genomic regions from BAM files.

---



## Installation

### Dependencies

Python dependencies:

```bash
pip install maturin
pip install numpy
```

Rust: follow the instructions at https://www.rust-lang.org/tools/install. This usually only requires pasting a single command into your terminal.

### Install with pip

The package can be installed with pip:
```bash
pip install Rust_covpyo3
```

### Local build
If no prebuilt wheel is available for your platform, or you want to build from source, you'll need to compile the Rust backend yourself.

**1. Install Python dependencies**

```bash
pip install maturin numpy
```

**2. Install a recent Rust toolchain**

Follow the official instructions at https://www.rust-lang.org/tools/install — usually a single command pasted into your terminal.

**3. Build and install the wheel**

From the repository root:

```bash
cd Rust_covpyo3
maturin build --release
python -m pip install -U target/wheels/*.whl
```

The build can take 30 seconds to a few minutes, depending on your machine and your internet connection (Cargo downloads the Rust dependencies on the first build).

> 💡 If you build multiple times, clear `target/wheels/` first so `pip` only sees one wheel to install.

> 💡 If you're working in a virtual environment and want hot-reloading during development, use `maturin develop` instead of `maturin build`. See the [maturin documentation](https://github.com/PyO3/maturin) for details.


## Usage

### `get_coverage_algo2`

Computes per-base coverage over a genomic region using an interval-based algorithm. Rather than piling up base-by-base, it parses each read's CIGAR string to determine the reference positions it covers, then increments a coverage array for those positions. This makes it efficient for sparse regions and gives you fine-grained control over which reads to include.

```python
from Rust_covpyo3 import get_coverage_algo2

coverage = get_coverage_algo2(
    start=10000,
    end=20000,
    chrom="chr1",
    strand="Plus",
    bam_path="sample.bam",
    lib="frFirstStrand",
    mapq_thr=10,
    flag_in=0,
    flag_exclude=256,
)
# coverage is a list of ints, one per position from start to end
```

### Parameters

| Parameter | Type | Description |
|---|---|---|
| `start` | `int` | Start of the region (0-based, inclusive) |
| `end` | `int` | End of the region (0-based, exclusive) |
| `chrom` | `str` | Chromosome / sequence name |
| `strand` | `str` | `"Plus"`, `"Minus"`, or `"NA"` (unstranded) |
| `bam_path` | `str` | Path to an indexed BAM file |
| `lib` | `str` | Library type — accepted values: `frFirstStrand` (TruSeq stranded), `frSecondStrand`, `fFirstStrand`, `fSecondStrand`, `ffFirstStrand`, `ffSecondStrand`, `rfFirstStrand`, `rfSecondStrand`, `rFirstStrand`, `rSecondStrand`. See [BAMstrandSpecifier](https://github.com/rLannes/BAMstrandSpecifier) |
| `mapq_thr` | `int` | Minimum mapping quality. Set to `0` to disable filtering |
| `flag_in` | `int` | SAM flags that **must** be set (bitwise). Use `0` for no requirement |
| `flag_exclude` | `int` | SAM flags that **must not** be set (bitwise). e.g. `256` to exclude secondary alignments |
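
The two flag parameters are bitmasks, so several SAM flags can be combined with a bitwise OR. The sketch below illustrates the semantics described in the table; it is an assumption about how the filter behaves, not the library's actual Rust code. The helper `passes_flag_filter` is hypothetical, and the constants are the standard SAM flag bits.

```python
# Standard SAM flag bits (values from the SAM specification).
PROPER_PAIR = 0x2        # 2
UNMAPPED = 0x4           # 4
SECONDARY = 0x100        # 256
QC_FAIL = 0x200          # 512
DUPLICATE = 0x400        # 1024
SUPPLEMENTARY = 0x800    # 2048


def passes_flag_filter(flag: int, flag_in: int, flag_exclude: int) -> bool:
    """Hypothetical helper: assumed meaning of flag_in / flag_exclude.

    A read passes if every bit of flag_in is set in its flag
    and no bit of flag_exclude is set.
    """
    has_all_required = (flag & flag_in) == flag_in
    has_any_excluded = (flag & flag_exclude) != 0
    return has_all_required and not has_any_excluded


# Combine several flags into one mask with bitwise OR.
flag_exclude = SECONDARY | DUPLICATE | SUPPLEMENTARY

print(flag_exclude)
print(passes_flag_filter(99, PROPER_PAIR, flag_exclude))   # primary, properly paired read
print(passes_flag_filter(355, PROPER_PAIR, flag_exclude))  # same read marked secondary
```

The combined mask can then be passed as `flag_exclude` to `get_coverage_algo2` in place of `256`.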

### Returns

A `list[int]` of length `end - start`, where each element is the read depth at that position.
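
Because the list has one entry per base, index `i` corresponds to reference position `start + i`. As a minimal sketch of working with the result, the hypothetical helpers below compute the mean depth and convert runs of covered positions back to 0-based, half-open genomic intervals. The coverage values are made up for illustration and do not come from a real BAM file.

```python
from typing import List, Tuple


def mean_depth(coverage: List[int]) -> float:
    """Average depth over the region (0.0 for an empty region)."""
    return sum(coverage) / len(coverage) if coverage else 0.0


def covered_intervals(
    coverage: List[int], start: int, min_depth: int = 1
) -> List[Tuple[int, int]]:
    """Return 0-based, half-open intervals where depth >= min_depth."""
    intervals = []
    run_start = None  # index where the current covered run began
    for i, depth in enumerate(coverage):
        if depth >= min_depth:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            intervals.append((start + run_start, start + i))
            run_start = None
    if run_start is not None:  # close a run that reaches the region end
        intervals.append((start + run_start, start + len(coverage)))
    return intervals


# Made-up output for a region starting at position 10000.
coverage = [0, 2, 3, 3, 0, 0, 1, 1]
print(mean_depth(coverage))
print(covered_intervals(coverage, start=10000))
print(covered_intervals(coverage, start=10000, min_depth=3))
```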

### How it works

1. All reads overlapping the `[start, end)` region are fetched from the BAM index.
2. Each read is filtered by `flag_in` / `flag_exclude` and mapping quality.
3. For strand-specific libraries, the read's strand is inferred from its flags and the library type. Only reads matching the requested `strand` are kept. For unstranded libraries, all passing reads are counted.
4. The read's CIGAR string is parsed to extract the intervals on the reference that the read actually covers (skipping deletions and spliced regions).
5. Those intervals are intersected with `[start, end)` and the corresponding positions in the output array are incremented.
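
Steps 4 and 5 can be sketched in pure Python. This is an illustration of the interval-based idea under stated assumptions, not the library's Rust implementation: the names `reference_blocks` and `add_read` are hypothetical, `M`, `=` and `X` are assumed to count as coverage, and `D` and `N` advance along the reference without counting. Filtering and strand inference (steps 2 and 3) are left out.

```python
import re
from typing import List, Tuple

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
COVERING_OPS = {"M", "=", "X"}   # consume reference and count as coverage
SKIPPING_OPS = {"D", "N"}        # consume reference but are not counted


def reference_blocks(pos: int, cigar: str) -> List[Tuple[int, int]]:
    """Half-open reference intervals covered by a read starting at 0-based pos."""
    blocks: List[Tuple[int, int]] = []
    ref = pos
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in COVERING_OPS:
            if blocks and blocks[-1][1] == ref:
                # Adjacent to the previous block (e.g. after an insertion): extend it.
                blocks[-1] = (blocks[-1][0], ref + n)
            else:
                blocks.append((ref, ref + n))
            ref += n
        elif op in SKIPPING_OPS:
            ref += n
        # I, S, H and P do not move along the reference.
    return blocks


def add_read(coverage: List[int], start: int, end: int, pos: int, cigar: str) -> None:
    """Intersect each block with [start, end) and increment the coverage array."""
    for block_start, block_end in reference_blocks(pos, cigar):
        lo = max(block_start, start)
        hi = min(block_end, end)
        for p in range(lo, hi):
            coverage[p - start] += 1


start, end = 100, 120
coverage = [0] * (end - start)
add_read(coverage, start, end, pos=95, cigar="10M5N10M")   # spliced read
add_read(coverage, start, end, pos=102, cigar="3S4M2D4M")  # soft clip and deletion
print(coverage)
```

In this toy example the spliced gap of the first read and the deleted bases of the second read add no depth, which is the behaviour step 4 describes.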


## Built distributions

Release 0.2.8 provides prebuilt wheels for CPython 3.10, 3.11, 3.12, 3.13, and 3.14 on manylinux (glibc 2.17+, x86-64), each about 4.0 MB. No source distribution is available for this release.
