Sparse convolution in python using Toeplitz convolution matrix multiplication.
Project description
sparse_convolution
Sparse 2D convolution in Python via Toeplitz matrix methods.
Fast when the kernel is small, the input is sparse, and/or many arrays share the same kernel.
Install
pip install sparse_convolution
Or from source:
git clone https://github.com/RichieHakim/sparse_convolution
cd sparse_convolution
pip install -e .
Usage
Single image
import numpy as np
import scipy.sparse
import sparse_convolution as sc
x = scipy.sparse.random(100, 100, density=0.01)
k = np.random.rand(5, 5)
conv = sc.Toeplitz_convolution2d(x_shape=x.shape, k=k, mode='same')
out = conv(x=x, batching=False).toarray()
Batched
Input: (n_images, H * W) sparse matrix. Output: (n_images, H_out * W_out).
x_batch = scipy.sparse.vstack([
scipy.sparse.random(100, 100, density=0.01).reshape(1, -1)
for _ in range(50)
]).tocsr()
conv = sc.Toeplitz_convolution2d(x_shape=(100, 100), k=k, mode='same')
out = conv(x=x_batch, batching=True)
Methods and backends
Four methods, each with selectable backends:
| Method | numpy | numba | torch |
|---|---|---|---|
direct |
n/a | yes | n/a |
precomputed |
yes | yes | yes |
lazy |
yes | n/a | yes |
gather_scatter |
yes | yes | yes |
direct(default): Batch-parallel scatter convolution with thread-local dense buffers (numba only). For each image in parallel, scatters kernel-weighted input values into an L2-cache-sized accumulator buffer, then extracts nonzeros into CSR format. Uses a two-phase approach: a lightweight boolean counting pass (1-byte flags, no float arithmetic) determines exact output sizes, then the scatter pass writes directly to right-sized arrays with zero waste. Interior pixels (~92-100%) skip bounds checking entirely via precomputed safe regions. O(nnz × K) per image with no init overhead. Fastest method across nearly all configurations. Requiresnumba.precomputed: Builds a sparse Toeplitz matrix at init; fast batched matmul. Best for large batches with the same kernel when numba is not available.lazy: COO broadcasting, no init cost. Best for very sparse inputs with small batches.gather_scatter: Per-kernel-position scatter into a dense accumulator. General-purpose method for sparse batched inputs.
Backend selection:
numpy: scipy/numpy ops. Always available.numba: JIT-compiled parallel loops. Fastest on CPU for batched inputs. Requiresnumba.torch: PyTorch ops with optional GPU. Requirestorch.
conv = sc.Toeplitz_convolution2d(
x_shape=(100, 100),
k=k,
mode='same',
method='direct', # default
backend='numba', # auto-selected for direct
)
If backend=None (default), auto-selects numba for direct and gather_scatter (if installed), numpy otherwise.
References
- Toeplitz convolution: stackoverflow.com/a/51865516, alisaaalehi/convolution_as_multiplication
- 1D convolution matrix: scipy.linalg.convolution_matrix
Benchmarks
All benchmarks run on CPU with 1s minimum measurement time per configuration (median reported). Nine method+backend combinations compared across six scaling sweeps.
Scaling overview
Six scaling sweeps varying batch size, density, image size, and kernel size. direct+numba (brown stars) is the fastest method in nearly all regimes.
Grid search: fastest method per configuration
Each cell shows the winning method and total time (init + call) for that batch size × density combination. direct+numba wins 28 of 36 configurations with an average 4.75× speedup over the second-fastest method.
Individual scaling curves
Batch size scaling — 100×100, 5×5, density=0.01
Density scaling — 100×100, 5×5, batch=100
Image size scaling — 5×5, density=0.01, batch=50
Kernel size scaling — 100×100, density=0.01, batch=50
Batch scaling — high density (0.1)
Batch scaling — very sparse (density=0.001, 200×200)
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file sparse_convolution-0.4.0.tar.gz.
File metadata
- Download URL: sparse_convolution-0.4.0.tar.gz
- Upload date:
- Size: 27.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8675b8e4023f653df98a8b09bc3fec778e916c4398688d321b7af811a0b0c5cc
|
|
| MD5 |
fceb6c7e81fb1fa84352cbfcd870535a
|
|
| BLAKE2b-256 |
08fac92cfafd3a98511665c0014f0077272352c21fbb09e6a8b024082acc401c
|
Provenance
The following attestation bundles were made for sparse_convolution-0.4.0.tar.gz:
Publisher:
pypi_release.yml on RichieHakim/sparse_convolution
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
sparse_convolution-0.4.0.tar.gz -
Subject digest:
8675b8e4023f653df98a8b09bc3fec778e916c4398688d321b7af811a0b0c5cc - Sigstore transparency entry: 1071937824
- Sigstore integration time:
-
Permalink:
RichieHakim/sparse_convolution@599aad54600385d795569958c8985fd0d9f06e69 -
Branch / Tag:
refs/heads/main - Owner: https://github.com/RichieHakim
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
pypi_release.yml@599aad54600385d795569958c8985fd0d9f06e69 -
Trigger Event:
workflow_dispatch
-
Statement type:
File details
Details for the file sparse_convolution-0.4.0-py3-none-any.whl.
File metadata
- Download URL: sparse_convolution-0.4.0-py3-none-any.whl
- Upload date:
- Size: 25.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2a49df61e2d7d359a52129fa09eb0f12db8f4579673fd98c6e748a153c241b80
|
|
| MD5 |
dbc3aa6dd358a7cc5ab920acd96a92d4
|
|
| BLAKE2b-256 |
ff8a665554aa104f8c1ccfcbec3d3da8047523ef7f4e6e1c30fa976c2cd6cf98
|
Provenance
The following attestation bundles were made for sparse_convolution-0.4.0-py3-none-any.whl:
Publisher:
pypi_release.yml on RichieHakim/sparse_convolution
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
sparse_convolution-0.4.0-py3-none-any.whl -
Subject digest:
2a49df61e2d7d359a52129fa09eb0f12db8f4579673fd98c6e748a153c241b80 - Sigstore transparency entry: 1071937901
- Sigstore integration time:
-
Permalink:
RichieHakim/sparse_convolution@599aad54600385d795569958c8985fd0d9f06e69 -
Branch / Tag:
refs/heads/main - Owner: https://github.com/RichieHakim
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
pypi_release.yml@599aad54600385d795569958c8985fd0d9f06e69 -
Trigger Event:
workflow_dispatch
-
Statement type: