Statistical Test for spatial patterns using k-nearest neighbors (STORM)
Project description
STORM-OMICS
A Principled Statistical Framework for Analyzing Spatial Patterns in Spatially Resolved Multi-Omics
STORM is a principled statistical method for identifying, quantifying, and comparing spatial patterns in spatially resolved multi-omics data. Unlike conventional approaches that focus solely on statistical significance, STORM provides a unified framework for:
- Identifying spatially variable features (SVFs)
- Quantifying the magnitude of spatial dependency through interpretable effect sizes
- Comparing spatial patterns across biological conditions
- Performing statistical power analysis and sample size determination for spatial studies
The framework is designed to support both single-sample and multi-sample analyses, enabling rigorous statistical inference in modern spatial transcriptomics and spatial multi-omics experiments.
This repository provides a standalone Python package equivalent to CastleLi/STORM.
Features
- Approximate and finite-sample STORM tests
- Dense NumPy, pandas DataFrame, and SciPy sparse expression matrices
- Batched CPU execution that preserves sparse inputs
- Optional batched PyTorch CUDA execution
- Reusable immutable exact k-NN graphs for repeated analyses
- Treatment-versus-control and treatment-versus-treatment comparisons
- Power analysis for all three tests
Installation
The PyPI distribution name is storm-omics; the Python import
is storm_omics. Python identifiers cannot contain hyphens. The shorter
storm distribution name belongs to an unrelated
Canonical ORM package. Until the first PyPI release, install from a source
checkout:
python -m pip install .
For local development:
# Run from the root of a source checkout.
python -m pip install -e .
GPU support
Install a CUDA-enabled PyTorch build appropriate for your operating system, driver, and CUDA runtime using the official PyTorch installer, then install STORM:
python -m pip install ".[gpu]"
GPU execution is opt-in with use_gpu=True. If PyTorch or CUDA is not
available, STORM uses the CPU. If a CUDA operation fails, STORM emits a
RuntimeWarning and recomputes on the CPU.
Spatial Pattern Test
The expression matrix must have shape M x N: spots in rows and features in
columns. Coordinates must have shape M x D.
import pandas as pd
import storm_omics
data = pd.read_csv("spatial_data.csv")
coords = data[["x", "y", "z"]].to_numpy()
expression = data.drop(columns=["x", "y", "z"])
result = storm_omics.storm(
coords,
expression,
k_nn=50,
approx=True,
use_gpu=False,
)
storm returns a DataFrame with:
gene_names: DataFrame column names, or generated names for array inputsp_values: two-sided p-values for spatial dependencyeffect_size: spatial effect size,1 - S2 / S0
Example output:
| gene_names | p_values | effect_size |
|---|---|---|
GeneA |
0.0001 | 0.065 |
GeneB |
0.0123 | 0.041 |
k_nn defaults to 50 for the approximation and 100 for the finite-sample
calculation. It is capped at M - 1. Constant features receive a neutral
p-value of 1 and effect size of 0.
To use CUDA:
result_gpu = storm_omics.storm(coords, expression, k_nn=50, use_gpu=True)
On the CPU, sparse inputs remain sparse throughout each feature batch. CUDA uses a sparse neighbor graph but transfers dense feature batches, with the batch size chosen from currently available VRAM. Large matrices remain float32; column reductions use bounded partial sums accumulated in float64 for CPU/GPU agreement without full-size float64 GPU copies.
Reference performance
Measured June 19, 2026 with Python 3.13, PyTorch 2.10.0, CUDA 12.8, an NVIDIA
GeForce RTX 5070 Ti (15.9 GB), k_nn=50, approx=True, and the included real
2,308-spot by 10,000-feature dataset tiled with small coordinate jitter:
| Copies | Spots | CPU | GPU | Speedup | CPU-run peak | GPU-run host peak | GPU peak VRAM |
|---|---|---|---|---|---|---|---|
| 1x | 2,308 | 0.243 s | 0.092 s | 2.64x | 185 MB | 156 MB | 193 MB |
| 2x | 4,616 | 0.476 s | 0.160 s | 2.97x | 265 MB | 248 MB | 259 MB |
| 3x | 6,924 | 0.717 s | 0.232 s | 3.09x | 357 MB | 340 MB | 260 MB |
| 4x | 9,232 | 0.903 s | 0.298 s | 3.03x | 449 MB | 440 MB | 262 MB |
All four CPU/GPU comparisons had identical p < 0.05 calls and maximum
p-value differences below 4.5e-5. Host peak allocations are measured inside
storm and exclude the already-loaded input DataFrame and interpreter. GPU
memory is PyTorch peak allocated VRAM. Timings are median of three; memory-mode
timings include tracing overhead and are therefore not shown here.
Reusing the exact neighbor graph
When several expression matrices share identical coordinates and neighbor count, prepare the graph once:
graph = storm_omics.prepare_storm_graph(coords, k_nn=50)
result_a = storm_omics.storm(coords, expression_a, graph=graph)
result_b = storm_omics.storm(coords, expression_b, graph=graph, use_gpu=True)
prepare_storm_graph runs the same exact directed k-NN construction as the
ordinary call. The graph is immutable, storm verifies that its coordinates
and neighbor count match, and exact-mode's graph-only scaling constant is
cached lazily. Reuse therefore changes only setup cost, not any statistic.
On the 4x real dataset, median repeated-call time decreased from 0.925 s to 0.871 s on CPU and from 0.360 s to 0.231 s on GPU. Cold graph preparation took 0.198 s, so it paid for itself after about four CPU analyses or two GPU analyses in this environment.
Group Comparisons
Treatment versus control
stormtrt implements a one-sided pooled two-proportion test with a default
per-sample significance threshold of 0.05.
import pandas as pd
import storm_omics
data = pd.DataFrame({
"Group": ["Control"] * 4 + ["Treatment"] * 4,
"Pvalue": [0.40, 0.20, 0.01, 0.30, 0.01, 0.02, 0.03, 0.20],
})
comparison = storm_omics.stormtrt(data, control="Control", sig_level=0.05)
Two treatment groups
data = pd.DataFrame({
"Group": ["A"] * 3 + ["B"] * 3,
"EffectSize": [0.10, 0.11, 0.09, 0.02, 0.03, 0.01],
"M": [3000] * 6,
"K": [50] * 6,
})
comparison = storm_omics.storm2trt(data, alternative="two.sided")
Power Analysis
Exactly one parameter described as unknown must be None.
import storm_omics
# Single-sample STORM: solve for number of spots.
single = storm_omics.power_storm(
es=0.06, n=None, power=0.80, sig_level=0.05
)
# Treatment versus control: solve for samples per control group.
control = storm_omics.powertrt(
nsample=None,
power=0.90,
sig_level=0.05,
ratio=1,
power_single=0.80,
sig_level_single=0.05,
)
# Two treatments: solve for group-1 sample size.
two_treatments = storm_omics.power2trt(
delta=0.04,
phi=4 / (50 * 3000),
psi1=0.010,
psi2=0.005,
nsample=None,
ratio=2,
sig_level=0.05,
power=0.80,
)
Power calculations return continuous sample-size estimates. Round up before using them as study sizes.
Tests and Benchmarks
python -m pytest
python test/benchmark_cpu_gpu.py --multipliers 1 2 3 4 --reps 3
python test/benchmark_cpu_gpu.py --multipliers 4 --reps 5 --reuse-graph
python test/benchmark_memory.py --multipliers 1 2 3 4
Benchmark results depend on data density, feature count, neighbor count, CPU, GPU, CUDA/PyTorch versions, and available memory. Run the included scripts on the target system rather than relying on timings from another machine.
Upstream Reproduction
The upstream repository contains the manuscript reproduction scripts for data
processing, simulations, benchmark analyses, figures, and supplementary
analyses in its
Reproduction
directory. Those R and manuscript-specific scripts are not duplicated in this
Python distribution.
Citation
When using this implementation, cite the STORM method and the upstream CastleLi/STORM software. The upstream manuscript citation is still listed as pending and should be added here once its final bibliographic record is available.
STORM authors: Jinpu Li (ORCID 0000-0002-6656-2896) and Yiqing Wang.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file storm_omics-1.0.0.tar.gz.
File metadata
- Download URL: storm_omics-1.0.0.tar.gz
- Upload date:
- Size: 29.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9be5f2b94f00a32ca55035d22f12c24119944d46f29b4ed7564932305b41bed2
|
|
| MD5 |
50700c730043cd5392e6c86982e5e325
|
|
| BLAKE2b-256 |
c64cddd97552a5e66da79797ae052fe8d5f9c083f2d55998f6009880a35db9a8
|
Provenance
The following attestation bundles were made for storm_omics-1.0.0.tar.gz:
Publisher:
pypi_release.yml on YQ-Wang/STORM
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
storm_omics-1.0.0.tar.gz -
Subject digest:
9be5f2b94f00a32ca55035d22f12c24119944d46f29b4ed7564932305b41bed2 - Sigstore transparency entry: 1885537222
- Sigstore integration time:
-
Permalink:
YQ-Wang/STORM@ccadc154665d7c396a5b50a7eb5f64c39c500ebc -
Branch / Tag:
refs/tags/v1.0.0 - Owner: https://github.com/YQ-Wang
-
Access:
private
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
pypi_release.yml@ccadc154665d7c396a5b50a7eb5f64c39c500ebc -
Trigger Event:
push
-
Statement type:
File details
Details for the file storm_omics-1.0.0-py3-none-any.whl.
File metadata
- Download URL: storm_omics-1.0.0-py3-none-any.whl
- Upload date:
- Size: 26.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f5af64867dcb1673693ebfdb371b8e43105d61a31498737753758a4d6b249731
|
|
| MD5 |
5a7c15e3464643a897145d1fe50302e2
|
|
| BLAKE2b-256 |
82585220fca28d789a31a369dd11d89148fb40cc3b8053e4599846446d643e04
|
Provenance
The following attestation bundles were made for storm_omics-1.0.0-py3-none-any.whl:
Publisher:
pypi_release.yml on YQ-Wang/STORM
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
storm_omics-1.0.0-py3-none-any.whl -
Subject digest:
f5af64867dcb1673693ebfdb371b8e43105d61a31498737753758a4d6b249731 - Sigstore transparency entry: 1885537331
- Sigstore integration time:
-
Permalink:
YQ-Wang/STORM@ccadc154665d7c396a5b50a7eb5f64c39c500ebc -
Branch / Tag:
refs/tags/v1.0.0 - Owner: https://github.com/YQ-Wang
-
Access:
private
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
pypi_release.yml@ccadc154665d7c396a5b50a7eb5f64c39c500ebc -
Trigger Event:
push
-
Statement type: