High-performance sparse matrix computation library for bioinformatics
🧬 biosparse
Sparse matrices. Reimagined for biology.
1,000x+ faster than scipy on sparse nonlinear ops. 10-100x faster than scanpy on HVG selection and statistical tests.
Zero-cost slicing. Numba-native. Production-ready.
Why biosparse?
biosparse is built on three pillars:
1️⃣ Biology-First Sparse Matrices
A custom sparse matrix format designed for how biologists actually work:
- Zero-cost slicing & stacking - Subset genes/cells without copying data
- scipy/numpy compatible - from_scipy(), to_scipy(), works with your existing code
- Memory efficient - Views instead of copies, reduced memory footprint
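from biosparse import CSRF64
import scipy.sparse as sp

# Example input: a random 5000 x 2000 CSR matrix (stand-in for your own data)
scipy_mat = sp.random(5000, 2000, density=0.01, format="csr")

# From scipy (zero-copy available)
csr = CSRF64.from_scipy(scipy_mat, copy=False)

# Zero-cost operations
subset = csr[1000:2000, :]  # No data copy
stacked = CSRF64.vstack([csr, subset])  # Efficient concatenation (column counts must match)

# Back to scipy when needed
scipy_mat = csr.to_scipy()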
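To make "no data copy" concrete, here is a minimal sketch of how a row slice of a CSR matrix can share its value buffer with the parent. It uses plain NumPy arrays and a hypothetical helper, `slice_rows`; it illustrates the general view-based technique and is not biosparse's actual implementation.

```python
import numpy as np

# Toy CSR arrays for a 4 x 5 matrix (hypothetical data, not from biosparse).
indptr = np.array([0, 2, 3, 3, 6])
indices = np.array([0, 3, 1, 0, 2, 4])
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])


def slice_rows(indptr, indices, data, start, stop):
    """Return CSR arrays for rows [start, stop) without copying values."""
    lo, hi = indptr[start], indptr[stop]
    # Basic slicing on a NumPy array returns a view, not a copy.
    sub_data = data[lo:hi]
    sub_indices = indices[lo:hi]
    # Only the small row-pointer array is rebuilt, rebased to start at 0.
    sub_indptr = indptr[start:stop + 1] - lo
    return sub_indptr, sub_indices, sub_data


sub_indptr, sub_indices, sub_data = slice_rows(indptr, indices, data, 1, 4)

# The sliced values alias the parent buffer.
shares = np.shares_memory(sub_data, data)
```

Because the slice is a view, writing to `data` is visible through `sub_data`. A library offering zero-cost slicing has to decide whether that aliasing is exposed to users or hidden behind copy-on-write semantics.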
2️⃣ High-Performance Kernels
Battle-tested algorithms built on our sparse matrix, compiled with Numba JIT:
| Algorithm | Speedup vs scipy | Speedup vs scanpy |
|---|---|---|
| Sparse nonlinear ops | 1,000 - 10,000x | - |
| HVG selection | - | 10 - 100x |
| Mann-Whitney U | - | 10 - 100x |
| t-test | - | 10 - 100x |
Speedup scales with core count
Supported:
- HVG: Seurat, Seurat V3, Cell Ranger, Pearson residuals
- Stats: Mann-Whitney U, Welch's t-test, Student's t-test, MMD
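As an illustration of why these statistics suit sparse storage, the sketch below computes Welch's t-statistic for one gene from its stored nonzero values only, treating every unstored entry as zero. The helpers `sparse_mean_var` and `welch_t` are hypothetical names for this example and are not part of biosparse's API.

```python
import math

import numpy as np


def sparse_mean_var(values, n_total):
    """Mean and unbiased variance over n_total entries, given only the nonzeros."""
    s = float(np.sum(values))
    ss = float(np.sum(values * values))
    mean = s / n_total
    # Implicit zeros add nothing to the sums; they only count towards n_total.
    var = (ss - n_total * mean * mean) / (n_total - 1)
    return mean, var


def welch_t(values_a, n_a, values_b, n_b):
    """Welch's t-statistic for two groups stored sparsely."""
    mean_a, var_a = sparse_mean_var(values_a, n_a)
    mean_b, var_b = sparse_mean_var(values_b, n_b)
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)


# Group A: 4 cells, two nonzero; group B: 4 cells, one nonzero.
t_stat = welch_t(np.array([2.0, 4.0]), 4, np.array([1.0]), 4)
```

The cost is proportional to the number of stored values rather than the number of cells. The sum-of-squares form used here can lose precision when the mean is large relative to the spread, so a production kernel may prefer a more stable formulation.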
3️⃣ Numba Optimization Toolkit
The secret sauce: tools that make Numba JIT outperform hand-written C++.
from numba import prange
from biosparse.optim import parallel_jit, assume, vectorize, likely

@parallel_jit
def my_kernel(csr):
    assume(csr.nrows > 0)  # Enable compiler optimizations
    for row in prange(csr.nrows):
        values, indices = csr.row_to_numpy(row)
        vectorize(8)  # SIMD hint
        for v in values:
            if likely(v > 0):  # Branch prediction
                pass  # ... per-element work goes here
Includes:
- LLVM intrinsics: assume, likely, unlikely, prefetch
- Loop hints: vectorize, unroll, interleave, distribute
- Complete tutorial - 7 chapters from basics to expert
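For comparison, the row-parallel pattern that the toolkit builds on can be sketched with stock Numba alone. The example below works on raw CSR arrays and uses none of the biosparse hints; the function name `row_log1p_sums` and the toy data are assumptions made for illustration. If Numba is not installed, it falls back to plain Python so the sketch still runs.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:  # Fallback: run as ordinary Python without JIT
    prange = range

    def njit(*args, **kwargs):
        def wrap(fn):
            return fn
        return wrap


@njit(parallel=True)
def row_log1p_sums(indptr, data):
    """Sum log1p(v) over the positive stored values of each CSR row."""
    n_rows = indptr.shape[0] - 1
    out = np.zeros(n_rows, dtype=np.float64)
    for row in prange(n_rows):  # Rows are independent, so they can run in parallel
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            v = data[k]
            if v > 0:
                acc += np.log1p(v)
        out[row] = acc
    return out


# Toy 3-row matrix: row 0 stores {1.0, -2.0}, row 1 is empty, row 2 stores {3.0}.
indptr = np.array([0, 2, 2, 3])
data = np.array([1.0, -2.0, 3.0])
sums = row_log1p_sums(indptr, data)
```

Each row writes to its own slot in `out`, so no synchronization is needed between threads. Hints such as `assume` or `likely` would be layered on top of a loop shaped like this one.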
Quick Start
pip install biosparse
from biosparse import CSRF64
from biosparse.kernel import hvg
# Load your data
import scanpy as sc
adata = sc.read_h5ad("data.h5ad")
# Convert (zero-copy)
csr = CSRF64.from_scipy(adata.X.T)
# HVG selection (10-100x faster than scanpy)
indices, mask, *_ = hvg.hvg_seurat_v3(csr, n_top_genes=2000)
# Use with scanpy
adata.var['highly_variable'] = mask.astype(bool)
Documentation
| Resource | Description |
|---|---|
| Tutorial | 7-chapter guide: from basics to outperforming C++ |
| Sparse API | CSR/CSC matrix reference |
| Kernels | HVG, MWU, t-test documentation |
| Optimization | LLVM intrinsics & loop hints |
License
MIT
Sparse. Fast. Biological.
Download files
Source Distribution
Built Distributions
File details
Details for the file biosparse-0.1.1.tar.gz.
File metadata
- Download URL: biosparse-0.1.1.tar.gz
- Upload date:
- Size: 184.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 512ff9b66bae3ecb4b5a06c6f12352bd2dc9c11c8535dfe663abfbdc179c8019 |
| MD5 | 9a5df82ddcc497d99b090739410baf87 |
| BLAKE2b-256 | f2441991e25f41db7a11d682f614cf5a22cb932162fffd9f0ad8e1aeb16803dd |
File details
Details for the file biosparse-0.1.1-py3-none-win_amd64.whl.
File metadata
- Download URL: biosparse-0.1.1-py3-none-win_amd64.whl
- Upload date:
- Size: 380.1 kB
- Tags: Python 3, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d42dd5ea23075794e1f1b8f57097e896e0c79935b6c8769ac8f3055e41ba079f |
| MD5 | 189d751bfe033352cdddbfe5361cfbb6 |
| BLAKE2b-256 | ddc7ab05ebc6b8842b14d9e3309d8c2b28249795d66574277bd90dfc341617a6 |
File details
Details for the file biosparse-0.1.1-py3-none-manylinux_2_17_x86_64.whl.
File metadata
- Download URL: biosparse-0.1.1-py3-none-manylinux_2_17_x86_64.whl
- Upload date:
- Size: 594.9 kB
- Tags: Python 3, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 620a4f007dad9e2a9b6983a9f746f49f800925a530856efcd5f5c532d75447d6 |
| MD5 | 5bb82aaa261288643a2202df31d54c43 |
| BLAKE2b-256 | 78393f1d9da323c7c4d2abeb70ef7443c905b172871ad1073dd3383edd8bf259 |
File details
Details for the file biosparse-0.1.1-py3-none-manylinux_2_17_aarch64.whl.
File metadata
- Download URL: biosparse-0.1.1-py3-none-manylinux_2_17_aarch64.whl
- Upload date:
- Size: 541.4 kB
- Tags: Python 3, manylinux: glibc 2.17+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f9706bf3a14c7d2f14b6639e1f4d5749242357c1780990e5ee01507379e1f683 |
| MD5 | 86c786d0c30d7671e1239f9ce155f511 |
| BLAKE2b-256 | c839524200b114bc9dc111dddb1f89c8ed9cecf9b416e841cd985f1b88b0c102 |
File details
Details for the file biosparse-0.1.1-py3-none-macosx_11_0_arm64.whl.
File metadata
- Download URL: biosparse-0.1.1-py3-none-macosx_11_0_arm64.whl
- Upload date:
- Size: 483.0 kB
- Tags: Python 3, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3d9ed2992c6c54f2631c40c6e546fcfc27f4833539933740cf58cd7092614749 |
| MD5 | 811930b01d1142974757aa9ec064a8b1 |
| BLAKE2b-256 | 37f9ab180d4863422704403d36db849615f47c995584fbf66b4d185fddf65584 |