Blazing fast genomic operations on large Python dataframes
Project description
polars-bio - Next-gen Python DataFrame operations for genomics!
polars-bio is a Python library for genomics built on top of polars, Apache Arrow and Apache DataFusion. It provides a DataFrame API for genomics data and is designed to be blazing fast, memory efficient and easy to use.
Key Features
- optimized for performance and memory efficiency when analyzing large-scale genomics datasets, both when reading input data and when performing operations
- popular genomics operations with a DataFrame API (both Pandas and polars)
- SQL-powered bioinformatic data querying or manipulation
- native parallel engine powered by Apache DataFusion and sequila-native
- out-of-core/streaming processing (for data too large to fit into a computer's main memory) with Apache DataFusion and polars
- support for federated and streamed reading of data from cloud storage (e.g. S3, GCS) with Apache OpenDAL, enabling processing of large-scale genomics data without materializing it in memory
- zero-copy data exchange with Apache Arrow
- bioinformatics file formats with noodles and exon
- fast overlap operations with COITrees: Cache Oblivious Interval Trees
- pre-built wheel packages for Linux, Windows and MacOS (arm64 and x86_64) available on PyPI
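To make the "overlap operations" feature concrete, here is a minimal pure-Python sketch of what an interval overlap computes. It is not the polars-bio API and does not use COITrees; it swaps in a simple sort-and-scan approach for illustration only. The function name `overlap_pairs`, the tuple layout, and the inclusive-end convention are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

# Hypothetical layout: (chromosome, start, end), both ends inclusive (assumption).
Interval = Tuple[str, int, int]


def overlap_pairs(
    left: Iterable[Interval], right: Iterable[Interval]
) -> List[Tuple[Interval, Interval]]:
    """Return every (left, right) pair on the same chromosome whose ranges intersect."""
    # Bucket the right-hand intervals by chromosome, sorted by start position.
    by_chrom = defaultdict(list)
    for iv in right:
        by_chrom[iv[0]].append(iv)
    for ivs in by_chrom.values():
        ivs.sort(key=lambda iv: iv[1])

    pairs = []
    for chrom, start, end in left:
        for cand in by_chrom.get(chrom, []):
            if cand[1] > end:
                break  # candidates are sorted by start, so none further can overlap
            if cand[2] >= start:
                pairs.append(((chrom, start, end), cand))
    return pairs


# Made-up example data, not taken from any real dataset.
reads = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
genes = [("chr1", 150, 300), ("chr1", 190, 210), ("chr2", 90, 120)]
result = overlap_pairs(reads, genes)
```

This scan is quadratic in the worst case; an interval tree such as the one named in the feature list exists precisely to avoid that cost on large inputs.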
Single-thread performance 🏃
Parallel performance 🏃🏃
Citing
If you use polars-bio in your work, please cite:
@article {Wiewiorka2025.03.21.644629,
author = {Wiewiorka, Marek and Khamutou, Pavel and Zbysinski, Marek and Gambin, Tomasz},
title = {polars-bio - fast, scalable and out-of-core operations on large genomic interval datasets},
elocation-id = {2025.03.21.644629},
year = {2025},
doi = {10.1101/2025.03.21.644629},
publisher = {Cold Spring Harbor Laboratory},
URL = {https://www.biorxiv.org/content/early/2025/03/25/2025.03.21.644629},
eprint = {https://www.biorxiv.org/content/early/2025/03/25/2025.03.21.644629.full.pdf},
journal = {bioRxiv}
}
Read the documentation
Download files
File details
Details for the file polars_bio-0.9.0.tar.gz.
File metadata
- Download URL: polars_bio-0.9.0.tar.gz
- Upload date:
- Size: 6.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.8.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 27979bbbbe1d3327627a855b39d283ea95970aec4cd31d289e775743c93be42f |
| MD5 | b56a5053efd96377b17a2dfb545aa306 |
| BLAKE2b-256 | 59eb47a1ae34dd0187ef6024b48edd2447e3b3e3dc0b848a19844aed0e1cdebd |
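The digests listed for each file can be checked locally after downloading. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `verify_download` are hypothetical, and the file path in the usage comment assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large wheels are never loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Return True if the file's SHA256 digest matches the published value."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage (assumes the source distribution sits in the working directory):
# verify_download(
#     "polars_bio-0.9.0.tar.gz",
#     "27979bbbbe1d3327627a855b39d283ea95970aec4cd31d289e775743c93be42f",
# )
```

SHA256 is used here because it is the strongest widely supported digest in the table; MD5 alone is not suitable for integrity checks against tampering.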
File details
Details for the file polars_bio-0.9.0-cp38-abi3-win_amd64.whl.
File metadata
- Download URL: polars_bio-0.9.0-cp38-abi3-win_amd64.whl
- Upload date:
- Size: 70.1 MB
- Tags: CPython 3.8+, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.8.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c697c5577d6d02439113f6d682e506c253374fd34e428adfcad33d6e69a51502 |
| MD5 | fb62de3eb70263a5d9b96977e55faf71 |
| BLAKE2b-256 | 15c92429024e868230dd2dcdd64fe3e5e2c917bf4af4048b174641bcad7aa53b |
File details
Details for the file polars_bio-0.9.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: polars_bio-0.9.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 79.4 MB
- Tags: CPython 3.8+, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.8.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ce65b26b718985de81640d447c48b83089df32b11fba1179ab19b6bca826be20 |
| MD5 | cab1356947877fed8d51287548f21984 |
| BLAKE2b-256 | 756bd354ec9eefbae9a89af9e95bb6142badfb24626cee6df063430af9b7ed14 |
File details
Details for the file polars_bio-0.9.0-cp38-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: polars_bio-0.9.0-cp38-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 71.4 MB
- Tags: CPython 3.8+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.8.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a7f90e3109bc4a4794cd7cf6dbe8ac0de49dcbf631fae388b716566af485dcad |
| MD5 | d99a842674eb9a64a266d7c3d6587059 |
| BLAKE2b-256 | c10682992737aefded86b6a019fd3a0decc48204bba329f03b63b6fe20bec64a |
File details
Details for the file polars_bio-0.9.0-cp38-abi3-macosx_10_12_x86_64.whl.
File metadata
- Download URL: polars_bio-0.9.0-cp38-abi3-macosx_10_12_x86_64.whl
- Upload date:
- Size: 76.6 MB
- Tags: CPython 3.8+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.8.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a7853a283e31a721ad77161b4df7e3d2ca83abb4dad585468eaeec7e327212e0 |
| MD5 | b0610b07a149e2d0584848fa7486b55b |
| BLAKE2b-256 | 9e57265bd8add12289e429eb82ba297a5fee024cc21ad0169af17818ad7a1cdc |