Rust port of scHiCluster's impute_chromosome inner pipeline — drop-in replacement for the published Python implementation. Bit-equivalent numerical behaviour, up to ~10× faster on long Hi-C chromosomes.
Project description
rust-scHiCluster
Rust re-implementation of the inner numerical pipeline of scHiCluster (Zhou et al. 2019, PNAS) — single-cell Hi-C contact-matrix imputation. ~10× faster on long chromosomes at single-cell impute, bit-equivalent within float-32 epsilon.
This is a separate package that re-implements the algorithm in Rust;
it is not a fork of the upstream Python tree. The original Python
implementation continues to be available as pip install schicluster
(and is used here only as the parity baseline in the test suite).
What it computes
schicluster_rs.impute_chromosome(...) runs the same end-to-end
pipeline as schicluster.impute.impute_chromosome.impute_chromosome:
- Read raw single-cell contact matrix from a
.coolfile. - Drop the diagonal.
- 2-D Gaussian convolution (mirror padding) — replaces
scipy.ndimage.gaussian_filter. - Drop the diagonal again.
- Row-normalise → P.
- Random-walk-with-restart fixed point:
Q = (1−rp)·P·Q + rp·Pfor up to 30 iterations or until ‖Q_t − Q_{t-1}‖_F < tol. - Symmetrise:
E = Q + Qᵀ. - SQRTVC normalise:
E ← D^{-1/2} · E · D^{-1/2}whereD = diag(Eᵀ𝟙). - Filter the upper triangle to entries with
j − i ≤ output_dist_bins. - Write the result to an HDF5 file (cooler-compatible).
Steps 2–9 run inside Rust; only the cooler read in step 1 and the HDF5 write in step 10 cross the Python boundary.
Speed
Real Chang 2024 LC462 mouse cortex Droplet Hi-C, 25 kb resolution:
| step | scipy upstream | rust | speedup |
|---|---|---|---|
| chr1 (n = 7820 bins) | 30.5 s | 3.2 s | 9.6× |
| chr19 (n = 2461 bins) | 0.4 s | 0.27 s | 1.5× |
| 20 chrs end-to-end per cell | 87 s | 33 s | 2.7× |
Multi-process parallelism (8 workers × 2 rayon threads = 16 cores total): 8 chr1 in parallel from 29 s → 9.4 s = an additional 3.1× beyond the per-cell speedup, by avoiding rayon thread oversubscription.
Accuracy
Bit-equivalent to upstream within float-32 ε. On real Chang chr1 (n = 7820, ~1.7 M output non-zeros):
max |E_rust − E_scipy|=8.94 × 10⁻⁸- Pearson correlation =
1.000000 - nnz match exactly.
tests/test_parity.py runs random_walk_cpu over (n, rp) ∈
{50, 200, 500} × {0.05, 0.5, 0.9} and asserts max-relative-error < 1e-4
against scipy's reference implementation. All 11 tests pass.
Install
Requires Rust ≥ 1.78 and maturin ≥ 1.4:
git clone https://github.com/omicverse/rust-scHiCluster
cd rust-scHiCluster
maturin develop --release # build + install into the active venv
PyPI release coming soon (pip install schicluster-rs).
Use
Drop-in monkey-patch (recommended) — no code changes anywhere:
import schicluster_rs
schicluster_rs.set_num_threads(2) # 8 workers × 2 = 16 cores
schicluster_rs.patch_schicluster()
# every downstream call to scHiCluster's impute_chromosome now uses Rust:
from schicluster.impute.impute_chromosome import impute_chromosome
impute_chromosome(scool_url=..., chrom='chr1', resolution=25_000,
output_path=..., rp=0.5, tol=0.01,
pad=1, std=1.0, output_dist=10_050_000)
Direct:
from schicluster_rs import random_walk_cpu, impute_chromosome
# Just the iterative RWR step (CSR → CSR):
Q = random_walk_cpu(P, rp=0.5, tol=0.01)
# Full inner pipeline (writes HDF5 like upstream):
impute_chromosome(scool_url='cell.cool', chrom='chr1',
resolution=25_000, output_path='chr1.hdf',
rp=0.5, tol=0.01, pad=1, std=1.0,
output_dist=10_050_000)
Multi-process tuning
schicluster's default workflow is ProcessPoolExecutor(max_workers=N).
Each worker forks the rayon thread pool — without explicit sizing, every
worker spawns num_cpus threads, leading to N × num_cpus contending
threads on a single node.
Set the per-worker rayon thread count via set_num_threads(n) in the
worker initialiser. Recommended sizing: n = num_cpus // num_workers.
Example: 16-core node with 8 workers → set_num_threads(2).
from concurrent.futures import ProcessPoolExecutor
import schicluster_rs
def worker_init():
schicluster_rs.set_num_threads(2)
schicluster_rs.patch_schicluster()
with ProcessPoolExecutor(max_workers=8, initializer=worker_init) as ex:
list(ex.map(impute_one_cell, cells))
Layout
rust-scHiCluster/
├── pyproject.toml maturin build config
├── README.md this file
├── LICENSE MIT
├── rust/
│ ├── Cargo.toml
│ └── src/lib.rs all algorithms (~500 LoC)
├── python/
│ └── schicluster_rs/__init__.py thin Python wrapper + monkey-patch
└── tests/
└── test_parity.py parity vs scipy on random sparse matrices
Algorithm notes
The hot loop is the iterative random-walk-with-restart, which is implemented as a Sparse-times-Dense matrix multiplication (SpMM) with rayon row-wise parallelism:
P(sparse, ~7 nnz per row after Gaussian smoothing) stays as CSR.Q(the iterate) is stored dense, since RWR diffuses it to ≥ 30 % density after 1–2 iterations anyway.- Each iteration:
Q' = (1−rp) · (P · Q) + rp · P. TheP · Qmatmul is computed row-wise; each output row is independent, so rayon splits row-chunks across cores. Within each row, the inner AXPY (accumulateP[i,k] · Q[k, :]for sparse k) vectorises cleanly.
Other steps (Gaussian convolution, SQRTVC normalize, triangle filter) are similarly multi-threaded over rows or chunks.
For users who can tolerate ≪1% deviation from the strict scipy result,
a band_factor parameter is available that runs the RWR with a banded
Q (only entries with |j − i| ≤ band_factor × output_dist_bins),
giving an additional ~4× speedup. Default is 0 (off, strict).
Citation
If you use this package, please cite the original scHiCluster paper:
Zhou, J., Ma, J., Chen, Y., Cheng, C., Bao, B., Peng, J., Sejnowski, T. J., Dixon, J. R. & Ecker, J. R. (2019). Robust single-cell Hi-C clustering by convolution- and random-walk-based imputation. PNAS, 116(28):14011-14018.
License
MIT.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file schicluster_rs-0.1.0.tar.gz.
File metadata
- Download URL: schicluster_rs-0.1.0.tar.gz
- Upload date:
- Size: 17.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.20
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
11ccfff16480850e95d6989091d3b78f1fe012fc3ddc19e63fd1c09292082e9a
|
|
| MD5 |
5f6356e3f775f1ed3d7dc3d29184087e
|
|
| BLAKE2b-256 |
7848530672aa4ec1fde84374bf7f217fb36a5da86528a24b9a4db7f0316cb058
|
File details
Details for the file schicluster_rs-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: schicluster_rs-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 325.2 kB
- Tags: CPython 3.10, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.20
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
63968047fe5013adc96e52f4e325fba0ae6651eedf001a9cec65de795540553d
|
|
| MD5 |
bb22bec7b2cffc14846bf4c7037b73e8
|
|
| BLAKE2b-256 |
bc4e0a1dc3a6c4db2581890409e2af60a316e0f14c668a333560b4f82cc2aded
|