High-performance sparse matrix computation library for bioinformatics

🧬 biosparse

Sparse matrices. Reimagined for biology.

Up to 1,000x faster than scipy on sparse nonlinear ops. 10-100x faster than scanpy on HVG selection and statistical tests.
Zero-cost slicing. Numba-native. Production-ready.


Why biosparse?

biosparse is built on three pillars:

1️⃣ Biology-First Sparse Matrices

A custom sparse matrix format designed for how biologists actually work:

  • Zero-cost slicing & stacking - Subset genes/cells without copying data
  • scipy/numpy compatible - from_scipy(), to_scipy(), works with your existing code
  • Memory efficient - Views instead of copies, reduced memory footprint

import scipy.sparse as sp
from biosparse import CSRF64

# Example input: a random scipy CSR matrix (float64 by default)
scipy_mat = sp.random(5000, 2000, density=0.01, format="csr")

# From scipy (zero-copy available)
csr = CSRF64.from_scipy(scipy_mat, copy=False)

# Zero-cost operations
subset = csr[1000:2000, :]              # No data copy
stacked = CSRF64.vstack([csr, subset])  # Efficient concatenation

# Back to scipy when needed
scipy_mat = csr.to_scipy()
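
To illustrate why slicing a contiguous block of rows out of a CSR matrix can avoid copying, here is a minimal sketch using only the Python standard library. It is not biosparse's implementation; the `csr_row_block` helper and the sample arrays are hypothetical. The idea is that rows `[start, stop)` occupy one contiguous range of the `data` and `indices` buffers, so a `memoryview` slice can expose them in place.

```python
from array import array

# A tiny 4x5 CSR matrix (hypothetical sample data):
# row 0: (col 1, 2.0)
# row 1: (col 0, 1.0), (col 3, 4.0)
# row 2: empty
# row 3: (col 2, 5.0), (col 4, 6.0)
data = array("d", [2.0, 1.0, 4.0, 5.0, 6.0])
indices = array("q", [1, 0, 3, 2, 4])
indptr = array("q", [0, 1, 3, 3, 5])


def csr_row_block(data, indices, indptr, start, stop):
    """Return views of rows [start, stop) without copying data or indices."""
    lo, hi = indptr[start], indptr[stop]
    data_view = memoryview(data)[lo:hi]        # shares memory with `data`
    indices_view = memoryview(indices)[lo:hi]  # shares memory with `indices`
    # Only the small row-pointer array is rebuilt, shifted to start at zero.
    new_indptr = array("q", (p - lo for p in indptr[start:stop + 1]))
    return data_view, indices_view, new_indptr


block_data, block_indices, block_indptr = csr_row_block(data, indices, indptr, 1, 4)

# A write to the parent buffer is visible through the view: no copy was made.
data[1] = 10.0
```

The only new allocation is the row-pointer array, whose length is the number of selected rows plus one, so the cost does not depend on how many nonzeros the block contains.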

2️⃣ High-Performance Kernels

Battle-tested algorithms built on our sparse matrix, compiled with Numba JIT:

Algorithm              vs scipy           vs scanpy
Sparse nonlinear ops   1,000 - 10,000x    -
HVG selection          -                  10 - 100x
Mann-Whitney U         -                  10 - 100x
t-test                 -                  10 - 100x

Speedup scales with core count

Supported:

  • HVG: Seurat, Seurat V3, Cell Ranger, Pearson residuals
  • Stats: Mann-Whitney U, Welch's t-test, Student's t-test, MMD
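
For readers unfamiliar with the Mann-Whitney U test listed above, the following is a minimal pure-Python sketch of the textbook U statistic with average ranks for ties. It is not biosparse's kernel, and the function names `average_ranks` and `mann_whitney_u` are hypothetical. Tie handling matters for expression data because the many zero counts all share one rank.

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return the U statistic of sample x against sample y."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2


# Hypothetical expression values for one gene in two groups of cells.
group_a = [0.0, 0.0, 1.5, 3.0]
group_b = [0.0, 2.0, 4.0, 5.0]
u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)
```

The two statistics always sum to the product of the group sizes, which is a convenient sanity check for any implementation.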

3️⃣ Numba Optimization Toolkit

The secret sauce: tools that make Numba JIT outperform hand-written C++.

from numba import prange
from biosparse.optim import parallel_jit, assume, vectorize, likely

@parallel_jit
def my_kernel(csr):
    assume(csr.nrows > 0)  # Enable compiler optimizations

    for row in prange(csr.nrows):  # Rows are processed in parallel
        values, indices = csr.row_to_numpy(row)

        vectorize(8)  # SIMD hint
        for v in values:
            if likely(v > 0):  # Branch prediction
                pass  # ... per-element work goes here

Includes:

  • LLVM intrinsics: assume, likely, unlikely, prefetch
  • Loop hints: vectorize, unroll, interleave, distribute
  • Complete tutorial - 7 chapters from basics to expert

Quick Start

pip install biosparse

from biosparse import CSRF64
from biosparse.kernel import hvg

# Load your data
import scanpy as sc
adata = sc.read_h5ad("data.h5ad")

# Convert (zero-copy)
csr = CSRF64.from_scipy(adata.X.T)

# HVG selection (10-100x faster than scanpy)
indices, mask, *_ = hvg.hvg_seurat_v3(csr, n_top_genes=2000)

# Use with scanpy
adata.var['highly_variable'] = mask.astype(bool)
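
HVG methods rank genes by how variable their expression is. As a rough illustration of the building block involved, the sketch below computes a per-gene mean and sample variance from the stored nonzero values only. It is an assumption-laden example, not the code behind `hvg_seurat_v3`, which applies further normalization on top; the helper `sparse_mean_var` and the sample values are hypothetical.

```python
import statistics


def sparse_mean_var(nonzeros, n_cells):
    """Mean and sample variance of one gene from its nonzero values only.

    Implicit zeros add nothing to either sum, so only the stored values
    are visited; n_cells supplies the true denominator.
    """
    total = sum(nonzeros)
    total_sq = sum(v * v for v in nonzeros)
    mean = total / n_cells
    var = (total_sq - n_cells * mean * mean) / (n_cells - 1)
    return mean, var


# Hypothetical gene expressed in 3 of 10 cells.
mean, var = sparse_mean_var([1.0, 2.0, 3.0], n_cells=10)

# Cross-check against the dense computation over all 10 cells.
dense = [1.0, 2.0, 3.0] + [0.0] * 7
dense_mean = statistics.mean(dense)
dense_var = statistics.variance(dense)
```

The work is proportional to the number of nonzeros rather than the number of cells, which is where sparse kernels gain over dense ones on typical single-cell matrices.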

Documentation

Resource       Description
Tutorial       7-chapter guide: from basics to outperforming C++
Sparse API     CSR/CSC matrix reference
Kernels        HVG, MWU, t-test documentation
Optimization   LLVM intrinsics & loop hints

License

MIT


Sparse. Fast. Biological.

Download files

Source Distribution

biosparse-0.1.0.tar.gz (187.5 kB)

Uploaded Source

Built Distributions

biosparse-0.1.0-py3-none-win_amd64.whl (379.7 kB)

Uploaded: Python 3, Windows x86-64

biosparse-0.1.0-py3-none-manylinux_2_17_x86_64.whl (595.7 kB)

Uploaded: Python 3, manylinux: glibc 2.17+ x86-64

biosparse-0.1.0-py3-none-manylinux_2_17_aarch64.whl (541.1 kB)

Uploaded: Python 3, manylinux: glibc 2.17+ ARM64

biosparse-0.1.0-py3-none-macosx_11_0_arm64.whl (483.0 kB)

Uploaded: Python 3, macOS 11.0+ ARM64

File details

Details for the file biosparse-0.1.0.tar.gz.

File metadata

  • Download URL: biosparse-0.1.0.tar.gz
  • Upload date:
  • Size: 187.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for biosparse-0.1.0.tar.gz
Algorithm Hash digest
SHA256 429935051e2f678f12aca39fb38ee12a3430378a87ce1164bbe51132de4665f4
MD5 c717b6bf92267515396852c68386ce6e
BLAKE2b-256 868a81d651f1df11b2cfb07a448757c84871579bd3a4dd945efbd3a77a2c77b6

See more details on using hashes here.
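
As a sketch of how a published digest can be checked after downloading, the following uses Python's standard `hashlib` to stream a file through SHA-256 and compare the result with the SHA256 value listed above for biosparse-0.1.0.tar.gz. The helper names `sha256_of_file` and `verify` are hypothetical, and the local file path is whatever you saved the download as.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Published SHA256 for biosparse-0.1.0.tar.gz, copied from the table above.
EXPECTED = "429935051e2f678f12aca39fb38ee12a3430378a87ce1164bbe51132de4665f4"


def verify(path, expected=EXPECTED):
    """Return True when the file's digest matches the published one."""
    return sha256_of_file(path) == expected
```

Calling `verify("biosparse-0.1.0.tar.gz")` on an intact download should return `True`; a `False` result means the file differs from the one that was published.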

File details

Details for the file biosparse-0.1.0-py3-none-win_amd64.whl.

File metadata

  • Download URL: biosparse-0.1.0-py3-none-win_amd64.whl
  • Upload date:
  • Size: 379.7 kB
  • Tags: Python 3, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for biosparse-0.1.0-py3-none-win_amd64.whl
Algorithm Hash digest
SHA256 6da774a88b1784c7ecd5339579e64ba3af3b82ebadf778976c623879c2d3fc99
MD5 64040bbc9b1e3fbfb02909fd37af7294
BLAKE2b-256 af1989f14406f699e4f4b80987e0f028d6e29034b939e8d5d636bfecaf984a42

File details

Details for the file biosparse-0.1.0-py3-none-manylinux_2_17_x86_64.whl.

File hashes

Hashes for biosparse-0.1.0-py3-none-manylinux_2_17_x86_64.whl
Algorithm Hash digest
SHA256 b00b21e3e38286edab36b906281cf783c2657e781b09063defe04f1cd0b24183
MD5 50a7bacb85ca64d850be5bce4b2bba41
BLAKE2b-256 584955ef3a337f98605f68049b35f665dd2a5b2745e72f44b647ed09de0c95b5

File details

Details for the file biosparse-0.1.0-py3-none-manylinux_2_17_aarch64.whl.

File hashes

Hashes for biosparse-0.1.0-py3-none-manylinux_2_17_aarch64.whl
Algorithm Hash digest
SHA256 6d720af64bcc09af5c67d7d0086b522c6ac419424d4d72e2713d9236a934c930
MD5 d23916e7b2a832b2a2bed779a09f586b
BLAKE2b-256 ced0358ce93228cb85c9195fcfdf3b01f9724f6d65387bcb5a4adad863347d47

File details

Details for the file biosparse-0.1.0-py3-none-macosx_11_0_arm64.whl.

File hashes

Hashes for biosparse-0.1.0-py3-none-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 95cadcc84eff8bcf826303727ff538e835f49eaae806044a27caf1eb9af99db3
MD5 3a27ca87e9d75df258aa21fcb492f0aa
BLAKE2b-256 583d3e02ba470f4dd0726fe4fc446e0dfcc904fec491a7e4d7e097bc7527ac7b
