Numba-accelerated Mann-Whitney U test with sparse matrix support.
numba-mwu
Numba-accelerated Mann-Whitney U test.
Drop-in replacement for scipy.stats.mannwhitneyu with parallel batch operations and native sparse matrix support.
All functions use the asymptotic (normal approximation) method and produce results identical to scipy.stats.mannwhitneyu(..., method="asymptotic").
Note: Only 1-D and 2-D inputs are supported.
Installation
uv pip install numba-mwu
API
Every function returns a MannWhitneyUResult named tuple with statistic and pvalue fields. For the batch functions, both fields are arrays (one entry per test) rather than scalars.
All functions accept use_continuity (default True) and alternative ("two-sided", "less", "greater").
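To make the asymptotic method and the two options concrete, here is a minimal pure-Python reference of the normal approximation with tie correction. It follows the textbook formulation that scipy documents for method="asymptotic"; it is not taken from the numba-mwu source, and the function name mwu_asymptotic is made up for this illustration.

```python
import math


def mwu_asymptotic(x, y, use_continuity=True, alternative="two-sided"):
    """Illustrative reference for the asymptotic Mann-Whitney U test."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted(list(x) + list(y))

    # Midranks: tied values share the mean of the ranks they occupy.
    midrank = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                              # size of this tie group
        midrank[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j

    # U statistic of x from its rank sum; U of y follows by symmetry.
    u1 = sum(midrank[v] for v in x) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1

    # Mean and tie-corrected standard deviation of U under the null.
    # Note: sd is zero (division by zero below) if all values are identical.
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))

    if alternative == "two-sided":
        u, factor = max(u1, u2), 2.0
    elif alternative == "greater":
        u, factor = u1, 1.0
    elif alternative == "less":
        u, factor = u2, 1.0
    else:
        raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")

    # The continuity correction shifts U by 0.5 towards the mean.
    z = (u - mean - (0.5 if use_continuity else 0.0)) / sd
    p = factor * 0.5 * math.erfc(z / math.sqrt(2))  # normal survival function
    return u1, min(1.0, max(0.0, p))


stat, p = mwu_asymptotic([1, 2, 3], [4, 5, 6])
print(stat, p)
```

The returned statistic is always the U of the first sample, whichever alternative is chosen; only the p-value changes.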
mannwhitneyu(x, y)
Single two-sample test. Equivalent to scipy.stats.mannwhitneyu(x, y, method="asymptotic").
from numba_mwu import mannwhitneyu
# x: (n1,), y: (n2,)
result = mannwhitneyu(x, y)
result.statistic # U statistic
result.pvalue # p-value (two-sided by default)
mannwhitneyu_rows(X, y)
Test each row of a 2-D array X against a shared reference sample y.
Parallelized across rows.
from numba_mwu import mannwhitneyu_rows
# X: (n_tests, n1), y: (n2,)
result = mannwhitneyu_rows(X, y)
result.statistic # shape (n_tests,)
result.pvalue # shape (n_tests,)
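What "each row against a shared reference" means can be sketched in plain Python. The helper names u_statistic and u_rows are hypothetical, and the pair-counting definition of U below is a slow reference, not how the library computes it.

```python
def u_statistic(x, y):
    """U of x: number of pairs with x_i > y_j, ties counting one half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def u_rows(X, y):
    """One U statistic per row of X, every row tested against the same y."""
    return [u_statistic(row, y) for row in X]


X = [[5, 6, 7], [1, 2, 3], [2, 4, 4]]  # (n_tests=3, n1=3)
y = [3, 4]                              # (n2=2,)
print(u_rows(X, y))  # [6.0, 0.5, 3.0]
```

Each row is an independent test, which is why the work parallelizes cleanly across rows.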
mannwhitneyu_columns(X, Y)
Test each column of X against the corresponding column of Y.
Parallelized across columns.
Designed for the common case of slicing a cells-by-genes matrix into two groups:
from numba_mwu import mannwhitneyu_columns
# expression: (n_cells, n_genes), labels: (n_cells,)
X = expression[labels == "A"] # (n1, n_genes)
Y = expression[labels == "B"] # (n2, n_genes)
result = mannwhitneyu_columns(X, Y)
result.statistic # shape (n_genes,)
result.pvalue # shape (n_genes,)
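The column-wise pairing can be spelled out the same way. This is a small pure-Python sketch with hypothetical helper names (u_statistic, u_columns) and made-up toy data; it only illustrates which values are compared with which.

```python
def u_statistic(x, y):
    """U of x: number of pairs with x_i > y_j, ties counting one half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def u_columns(X, Y):
    """One U statistic per column: column j of X against column j of Y."""
    # zip(*M) iterates over the columns of a list-of-rows matrix.
    return [u_statistic(cx, cy) for cx, cy in zip(zip(*X), zip(*Y))]


expression = [[0, 3], [1, 0], [4, 2], [5, 1]]  # (n_cells=4, n_genes=2)
labels = ["A", "A", "B", "B"]                   # (n_cells,)

# Split the cells (rows) by label; the genes (columns) stay aligned.
X = [row for row, lab in zip(expression, labels) if lab == "A"]
Y = [row for row, lab in zip(expression, labels) if lab == "B"]
print(u_columns(X, Y))  # [0.0, 2.0]
```

X and Y may have different numbers of rows, but they must have the same number of columns.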
mannwhitneyu_sparse(X, Y)
Same as mannwhitneyu_columns but operates directly on CSR sparse matrices without converting to dense.
Memory overhead per matrix is one int64 array of length nnz (column permutation) plus one int64 array of length n_genes + 1 (column pointers).
No data values are copied.
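One way such a column permutation and column pointer array can be built from the CSR indices array is a counting sort, sketched below in plain Python. This is an assumption about the general technique, not the library's actual code, and csr_column_index is a hypothetical name.

```python
def csr_column_index(indices, n_cols):
    """Build (perm, col_ptr) so that perm[col_ptr[j]:col_ptr[j + 1]]
    lists the positions in the CSR data array that belong to column j."""
    # Count entries per column, then prefix-sum into column pointers.
    col_ptr = [0] * (n_cols + 1)
    for j in indices:
        col_ptr[j + 1] += 1
    for j in range(n_cols):
        col_ptr[j + 1] += col_ptr[j]

    # Scatter each stored position into the next free slot of its column.
    perm = [0] * len(indices)
    next_slot = col_ptr[:-1]  # slicing copies, so col_ptr is left intact
    for pos, j in enumerate(indices):
        perm[next_slot[j]] = pos
        next_slot[j] += 1
    return perm, col_ptr


# CSR arrays of [[0, 2, 0],
#                [1, 0, 3],
#                [0, 4, 0]]
data = [2, 1, 3, 4]
indices = [1, 0, 2, 1]

perm, col_ptr = csr_column_index(indices, n_cols=3)
column_1 = [data[p] for p in perm[col_ptr[1]:col_ptr[2]]]
print(perm, col_ptr, column_1)  # [1, 0, 3, 2] [0, 1, 3, 4] [2, 4]
```

The values in data are only read through perm, so the extra memory is the two index arrays: nnz entries for perm and n_cols + 1 entries for col_ptr.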
Requires non-negative data (raw counts, normalized expression, etc.).
Note: Call eliminate_zeros() on each matrix beforehand if it may contain explicitly stored zeros.
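A plausible reading of both requirements: if the data are non-negative, all implicit zeros tie for the lowest ranks, so the rank sum can be computed from the stored entries alone. The sketch below illustrates that idea in plain Python; it is an assumption about the approach, not the library's implementation, and u_from_nonzeros is a hypothetical name.

```python
def u_from_nonzeros(x_nz, n1, y_nz, n2):
    """U of x for one column, given only the stored (positive) values of
    each group and the group sizes n1 and n2."""
    x_nz, y_nz = list(x_nz), list(y_nz)
    x_zeros = n1 - len(x_nz)
    zeros = x_zeros + (n2 - len(y_nz))  # implicit zeros in both groups
    zero_rank = (zeros + 1) / 2         # all zeros share ranks 1 .. zeros

    # Midranks of the stored values, shifted above the block of zeros.
    pooled = sorted(x_nz + y_nz)
    midrank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        midrank[pooled[i]] = zeros + (i + 1 + j) / 2
        i = j

    rank_sum = x_zeros * zero_rank + sum(midrank[v] for v in x_nz)
    return rank_sum - n1 * (n1 + 1) / 2


# Dense columns x = [0, 0, 3, 0, 5] and y = [0, 2, 0, 3].
print(u_from_nonzeros([3, 5], 5, [2, 3], 4))     # 10.5, same as the dense U
# An explicitly stored zero is ranked above the implicit zeros: wrong result.
print(u_from_nonzeros([0, 3, 5], 5, [2, 3], 4))  # 11.5
```

In this sketch a negative value would also break the assumption that zeros occupy the lowest ranks.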
from numba_mwu import mannwhitneyu_sparse
# adata.X is a CSR matrix, adata.obs["group"] has labels
mask = (adata.obs["group"] == "A").to_numpy() # plain boolean array for row indexing
X = adata.X[mask] # CSR row-slice is still CSR
Y = adata.X[~mask]
result = mannwhitneyu_sparse(X, Y)
result.statistic # shape (n_genes,)
result.pvalue # shape (n_genes,)
Benchmarks
Run benchmarks with:
uv run benchmarks/bench_mwu.py
================================================================================
SINGLE PAIR BENCHMARKS (overhead comparison)
================================================================================
--- integer data ---
scenario                          scipy        numba      speedup
-----------------------------------------------------------------
n=20 vs n=20                   223.1 us       3.9 us        56.9x
n=100 vs n=100                 224.0 us       5.4 us        41.7x
n=500 vs n=500                 248.3 us      12.6 us        19.7x
n=1000 vs n=1000               287.2 us      22.7 us        12.7x
--- float data ---
scenario                          scipy        numba      speedup
-----------------------------------------------------------------
n=20 vs n=20                   212.6 us       3.9 us        53.9x
n=100 vs n=100                 220.7 us       5.6 us        39.4x
n=500 vs n=500                 249.4 us      14.7 us        16.9x
n=1000 vs n=1000               287.3 us      27.4 us        10.5x
================================================================================
DENSE MATRIX BENCHMARKS
================================================================================
--- integer data ---
scenario                          scipy        numba      speedup
-----------------------------------------------------------------
small (100x50)                  11.4 ms      64.1 us       177.8x
medium (1000x500)              139.5 ms       1.5 ms        94.0x
large (5000x2000)                1.01 s      43.7 ms        23.0x
xlarge (10000x5000)              3.93 s     179.5 ms        21.9x
--- float data ---
scenario                          scipy        numba      speedup
-----------------------------------------------------------------
small (100x50)                  11.1 ms      53.0 us       208.5x
medium (1000x500)              131.5 ms       1.2 ms       109.1x
large (5000x2000)              866.6 ms      36.0 ms        24.1x
xlarge (10000x5000)              3.33 s     151.9 ms        22.0x
================================================================================
SPARSE MATRIX BENCHMARKS
================================================================================
--- integer data ---
scenario                      scipy (dense)  numba sparse   numba dense    sp speedup
-------------------------------------------------------------------------------------
small 90% (200x100)                 22.7 ms       51.3 us       84.3 us        442.3x
medium 90% (2000x1000)             275.5 ms        1.0 ms        3.5 ms        266.9x
large 95% (5000x2000)              746.8 ms        2.6 ms       20.4 ms        282.1x
xlarge 95% (10000x5000)              2.80 s       21.1 ms      117.2 ms        132.6x
--- float data ---
scenario                      scipy (dense)  numba sparse   numba dense    sp speedup
-------------------------------------------------------------------------------------
small 90% (200x100)                 22.7 ms       53.2 us       80.7 us        427.0x
medium 90% (2000x1000)             279.5 ms        1.0 ms        4.3 ms        268.9x
large 95% (5000x2000)              741.1 ms        3.5 ms       23.7 ms        209.4x
xlarge 95% (10000x5000)              2.80 s       21.0 ms      111.5 ms        133.0x