High-performance sparse matrix computation library for bioinformatics

🧬 biosparse

Sparse matrices. Reimagined for biology.

Up to 1,000-10,000x faster than scipy on sparse nonlinear ops. 10-100x faster than scanpy on HVG selection and statistical tests.
Zero-cost slicing. Numba-native. Production-ready.


Why biosparse?

biosparse is built on three pillars:

1️⃣ Biology-First Sparse Matrices

A custom sparse matrix format designed for how biologists actually work:

  • Zero-cost slicing & stacking - Subset genes/cells without copying data
  • scipy/numpy compatible - from_scipy(), to_scipy(), works with your existing code
  • Memory efficient - Views instead of copies, reduced memory footprint

from biosparse import CSRF64
import scipy.sparse as sp

# Example input: any scipy CSR matrix works here (random data for illustration)
scipy_mat = sp.random(5000, 2000, density=0.01, format="csr")

# From scipy (zero-copy available)
csr = CSRF64.from_scipy(scipy_mat, copy=False)

# Zero-cost operations
subset = csr[1000:2000, :]                            # No data copy
stacked = CSRF64.vstack([subset, csr[2000:3000, :]])  # Efficient concatenation

# Back to scipy when needed
scipy_mat = csr.to_scipy()
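
To see why slicing a contiguous block of rows from a CSR matrix can avoid copying the stored values, here is a minimal NumPy-only sketch. It is not biosparse's implementation: the helper `row_slice_views` is a hypothetical name, and the sketch assumes only the standard CSR layout (`indptr`, `indices`, `data`).

```python
import numpy as np

# A 4 x 5 matrix in standard CSR layout (illustrative data).
indptr = np.array([0, 2, 3, 3, 6])
indices = np.array([0, 3, 1, 0, 2, 4])
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])


def row_slice_views(indptr, indices, data, start, stop):
    """Return CSR components for rows [start, stop) without copying values.

    Hypothetical helper: `indices` and `data` come back as NumPy views.
    Only the rebased `indptr` slice (stop - start + 1 integers) is newly allocated.
    """
    lo, hi = indptr[start], indptr[stop]
    sub_indptr = indptr[start:stop + 1] - lo        # small new array
    return sub_indptr, indices[lo:hi], data[lo:hi]  # basic slices are views


sub_indptr, sub_indices, sub_data = row_slice_views(indptr, indices, data, 1, 4)
print(sub_indptr)                        # [0 1 1 4]
print(np.shares_memory(sub_data, data))  # True
```

Because basic NumPy slices share memory with the parent array, the cost of the slice depends on the number of selected rows, not on the number of stored values.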

2️⃣ High-Performance Kernels

Battle-tested algorithms built on our sparse matrix, compiled with Numba JIT:

| Algorithm | vs scipy | vs scanpy |
|---|---|---|
| Sparse nonlinear ops | 1,000-10,000x | - |
| HVG selection | - | 10-100x |
| Mann-Whitney U | - | 10-100x |
| t-test | - | 10-100x |

Speedup scales with core count

Supported:

  • HVG: Seurat, Seurat V3, Cell Ranger, Pearson residuals
  • Stats: Mann-Whitney U, Welch's t-test, Student's t-test, MMD
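
Several of the methods listed above rely on per-gene statistics such as means and variances. As a rough illustration of how such statistics can be computed from the nonzero entries alone, here is a NumPy-only sketch. It is not one of biosparse's kernels: `csr_row_mean_var` is a hypothetical name, and the matrix is assumed to be genes x cells in standard CSR layout.

```python
import numpy as np


def csr_row_mean_var(indptr, data, n_cols):
    """Per-row mean and sample variance of a CSR matrix, touching only nonzeros."""
    n_rows = len(indptr) - 1
    means = np.zeros(n_rows)
    variances = np.zeros(n_rows)
    for i in range(n_rows):
        vals = data[indptr[i]:indptr[i + 1]]
        mean = vals.sum() / n_cols
        # Stored values contribute (v - mean)^2; each implicit zero contributes mean^2.
        ss = ((vals - mean) ** 2).sum() + (n_cols - len(vals)) * mean ** 2
        means[i] = mean
        variances[i] = ss / (n_cols - 1)
    return means, variances


# 3 genes x 4 cells; the second gene has no stored values.
indptr = np.array([0, 2, 2, 5])
data = np.array([2.0, 4.0, 1.0, 1.0, 1.0])
means, variances = csr_row_mean_var(indptr, data, n_cols=4)

# Check against the equivalent dense computation.
dense = np.array([[0, 2, 0, 4], [0, 0, 0, 0], [1, 1, 1, 0]], dtype=float)
print(np.allclose(means, dense.mean(axis=1)))             # True
print(np.allclose(variances, dense.var(axis=1, ddof=1)))  # True
```

The work per row is proportional to the number of stored values, which is the property that makes sparse-aware kernels attractive for mostly-zero expression matrices.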

3️⃣ Numba Optimization Toolkit

The secret sauce: tools that make Numba JIT outperform hand-written C++.

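from numba import prange  # assumed: Numba's prange, which the snippet uses but did not import
from biosparse.optim import parallel_jit, assume, vectorize, likely

@parallel_jit
def my_kernel(csr):
    assume(csr.nrows > 0)  # Tell the compiler nrows is positive so it can optimize

    for row in prange(csr.nrows):  # Rows are processed in parallel
        values, indices = csr.row_to_numpy(row)

        vectorize(8)  # SIMD hint for the loop below
        for v in values:
            if likely(v > 0):  # Branch-prediction hint
                pass  # ... per-value work goes here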

Includes:

  • LLVM intrinsics: assume, likely, unlikely, prefetch
  • Loop hints: vectorize, unroll, interleave, distribute
  • Complete tutorial - 7 chapters from basics to expert
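
For comparison, the sketch below shows the same row-parallel loop pattern written with plain Numba only, using its public `njit(parallel=True)` decorator and `prange`. It does not use or reproduce the biosparse hints (`assume`, `vectorize`, `likely`). The kernel name `count_positive_per_row` is hypothetical, and the import fallback is an assumption added so the example still runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit, prange
except ImportError:  # Fallback so the sketch still runs (slowly) without Numba.
    prange = range

    def njit(*args, **kwargs):
        def decorator(fn):
            return fn
        return decorator


@njit(parallel=True)
def count_positive_per_row(indptr, data):
    """Count strictly positive stored values in each CSR row."""
    n_rows = indptr.shape[0] - 1
    counts = np.zeros(n_rows, dtype=np.int64)
    for row in prange(n_rows):  # rows are independent, so iterations can run in parallel
        c = 0
        for k in range(indptr[row], indptr[row + 1]):
            if data[k] > 0:
                c += 1
        counts[row] = c
    return counts


indptr = np.array([0, 2, 3, 3, 6], dtype=np.int64)
data = np.array([1.0, -2.0, 3.0, 4.0, 0.0, 6.0])
print(count_positive_per_row(indptr, data))  # [1 1 0 2]
```

Each row writes only to its own slot in `counts`, so the parallel loop needs no synchronization.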

Quick Start

pip install biosparse

from biosparse import CSRF64
from biosparse.kernel import hvg

# Load your data
import scanpy as sc
adata = sc.read_h5ad("data.h5ad")

# Convert, transposing cells x genes to genes x cells
# (pass copy=False to request zero-copy, as in the earlier example)
csr = CSRF64.from_scipy(adata.X.T)

# HVG selection (10-100x faster than scanpy)
indices, mask, *_ = hvg.hvg_seurat_v3(csr, n_top_genes=2000)

# Use with scanpy
adata.var['highly_variable'] = mask.astype(bool)

Documentation

| Resource | Description |
|---|---|
| Tutorial | 7-chapter guide: from basics to outperforming C++ |
| Sparse API | CSR/CSC matrix reference |
| Kernels | HVG, MWU, t-test documentation |
| Optimization | LLVM intrinsics & loop hints |

License

MIT


Sparse. Fast. Biological.
