Deterministic compute cache for SVD and PCA — lossless, 100-300x faster on repeated calls (same matrix)
ZeroFold
100–300× faster on repeated calls (same matrix) — lossless, deterministic, zero bit difference.
pip install zerofold
from zerofold import svd, pca
# First call: computes at standard speed (NumPy/SciPy)
result = svd(weight_matrix, n_components=64)
# Every subsequent call: O(1) retrieval — bitwise identical output
result = svd(weight_matrix, n_components=64) # microseconds, not seconds
What this is
A deterministic compute cache for expensive linear algebra operations.
| Call | Cost | Output |
|---|---|---|
| First | Standard NumPy/SciPy speed | Exact result, stored |
| Subsequent | O(1) retrieval | Bitwise identical to first call |
No approximation. No tolerance. Zero bit difference between calls.
Important: Speedups occur only when the same matrix is reused. First-time computations run at standard speed. If every matrix you compute on is unique, this tool is not for you.
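To make the idea concrete, here is a minimal sketch of a content-addressed SVD cache. It is not ZeroFold's implementation: the `cached_svd` and `matrix_key` helpers, the `_CACHE` dictionary, and the choice of SHA-256 over dtype, shape, and raw bytes as the key are all assumptions made for illustration.

```python
import hashlib
import numpy as np

_CACHE = {}  # hypothetical in-memory store: key -> (U, S, Vt)

def matrix_key(a: np.ndarray, n_components: int) -> str:
    """Hash dtype, shape, n_components, and raw bytes so equal matrices share a key."""
    a = np.ascontiguousarray(a)
    h = hashlib.sha256()
    h.update(str((a.dtype.str, a.shape, n_components)).encode())
    h.update(a.tobytes())
    return h.hexdigest()

def cached_svd(a: np.ndarray, n_components: int):
    """Return (U, S, Vt, from_cache); run the NumPy SVD only on a miss."""
    key = matrix_key(a, n_components)
    if key in _CACHE:
        return (*_CACHE[key], True)
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    result = (u[:, :n_components], s[:n_components], vt[:n_components])
    _CACHE[key] = result
    return (*result, False)

rng = np.random.default_rng(42)
x = rng.standard_normal((64, 32))
u1, s1, vt1, hit1 = cached_svd(x, 8)         # miss: computed and stored
u2, s2, vt2, hit2 = cached_svd(x.copy(), 8)  # hit: same content, same key
```

Because a hit returns the stored arrays rather than recomputing them, the second result is bitwise identical to the first by construction. Note that in this sketch the lookup still hashes every byte of the input, so its cost grows with the size of the matrix.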
When this is useful
| Use case | Why it helps |
|---|---|
| Neural network inference | Same weight matrices queried every batch → 99%+ hit rate |
| Repeated analytics pipelines | Same dataset processed repeatedly |
| Scientific computing | Same Laplacian/Hamiltonian, different parameters |
| Feature engineering / PCA reuse | Common in production ML pipelines |
When this has zero value
- One-off computations on unique matrices
- Streaming data where every matrix is different
- Workloads with no matrix reuse
Benchmark (SEED=42 — run it yourself, get the same correctness results)
python -X utf8 benchmark.py
Test 1 — Same matrix repeated calls: first call vs retrieval
| n | First call | Retrieval | Speedup |
|---|---|---|---|
| 128 | ~10 ms | ~120 µs | ~80× |
| 512 | ~280 ms | ~1.6 ms | ~175× |
| 1024 | ~1.6 s | ~6.5 ms | ~245× |
| 2048 | ~5.9 s | ~22 ms | ~270× |
Timing varies by hardware. Correctness results are identical on every machine.
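The speedup column is the ratio of the two timing columns. As a quick sanity check, the sketch below recomputes it from the rounded figures in the table. The `rows` list is transcribed by hand, and since the inputs are approximate the ratios only roughly match the published column.

```python
# (n, first call in seconds, retrieval in seconds), copied from the Test 1 table
rows = [
    (128, 10e-3, 120e-6),
    (512, 280e-3, 1.6e-3),
    (1024, 1.6, 6.5e-3),
    (2048, 5.9, 22e-3),
]

# Speedup = first-call time divided by retrieval time
speedups = {n: first / retrieval for n, first, retrieval in rows}

for n, s in speedups.items():
    print(f"n={n}: ~{s:.0f}x")
```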
Test 3 — Neural network weights (fixed per batch)
| Metric | Result |
|---|---|
| Weight matrix hit rate | 99.5% |
| Bit difference on retrieval | 0.00e+00 |
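For intuition about where a hit rate like this comes from, the sketch below counts hits for two synthetic workloads. The workload sizes and key names are assumptions chosen for illustration; they are not taken from `benchmark.py`.

```python
def hit_rate(keys):
    """Fraction of lookups whose key has already been seen."""
    seen, hits = set(), 0
    for k in keys:
        if k in seen:
            hits += 1
        else:
            seen.add(k)
    return hits / len(keys)

# One fixed weight matrix looked up once per batch for 200 batches (assumed workload):
# the first lookup misses, the remaining 199 hit.
fixed = hit_rate(["layer0.weight"] * 200)

# A stream in which every matrix is new never hits.
unique = hit_rate([f"matrix-{i}" for i in range(200)])
```

With a single cold miss, the hit rate approaches 100% as the number of batches grows, while a stream of unique matrices stays at zero, which is the case this tool does not help with.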
Test 4 — Lossless verification
[PASS] n= 64 S_diff=0.00e+00 Vt_diff=0.00e+00 U_diff=0.00e+00
[PASS] n=128 S_diff=0.00e+00 Vt_diff=0.00e+00 U_diff=0.00e+00
[PASS] n=256 S_diff=0.00e+00 Vt_diff=0.00e+00 U_diff=0.00e+00
[PASS] n=512 S_diff=0.00e+00 Vt_diff=0.00e+00 U_diff=0.00e+00
5/5 PASS — all diffs exactly 0
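A check of this kind can be sketched in a few lines. The helper names below are hypothetical, and the cache round trip is simulated with array copies rather than a real ZeroFold call. It reports both the maximum absolute difference and a stricter byte-level comparison.

```python
import numpy as np

def max_abs_diff(a: np.ndarray, b: np.ndarray) -> float:
    """Largest elementwise absolute difference; 0.0 means no element differs."""
    return float(np.max(np.abs(a - b)))

def bitwise_equal(a: np.ndarray, b: np.ndarray) -> bool:
    """Stricter check: same dtype, same shape, same raw bytes."""
    return a.dtype == b.dtype and a.shape == b.shape and a.tobytes() == b.tobytes()

rng = np.random.default_rng(42)
x = rng.standard_normal((64, 64))
u, s, vt = np.linalg.svd(x, full_matrices=False)

# Stand-in for a cache round trip: the "retrieved" arrays are copies of the stored ones
stored = (u.copy(), s.copy(), vt.copy())

results = {}
for name, first, again in zip(("U", "S", "Vt"), (u, s, vt), stored):
    results[name] = bitwise_equal(first, again)
    status = "PASS" if results[name] else "FAIL"
    print(f"[{status}] {name}_diff={max_abs_diff(first, again):.2e}")
```

The byte comparison is the stronger of the two: a zero absolute difference cannot tell `0.0` from `-0.0`, whereas comparing raw bytes can.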
How it works
Role classification routes first-time computation to the fastest correct algorithm, then stores the result indexed by the matrix's structural signature:
| Role | Matrix type | First-call algorithm |
|---|---|---|
| Completion | Near-identity | Diagonal shortcut: O(n), exact |
| Prime | Symmetric | scipy.linalg.eigh: faster for symmetric input, exact |
| Composite | General | numpy.linalg.svd: full precision |
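After the first call, every subsequent call is a retrieval regardless of role. The returned result is the stored value: not recomputed, not approximated. "O(1)" here means the cost no longer depends on the decomposition; the retrieval times in Test 1 still grow with n.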
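The routing step can be sketched as a small classifier. This is an illustration, not the library's code: `classify_role` and `diagonal_singular_values` are hypothetical names, and the "near-identity" test is narrowed here to "exactly diagonal", since that is the case where reading singular values off the diagonal is exact.

```python
import numpy as np

def classify_role(a: np.ndarray) -> str:
    """Return 'completion', 'prime', or 'composite' for a 2-D matrix."""
    square = a.ndim == 2 and a.shape[0] == a.shape[1]
    if square and np.array_equal(a, np.diag(np.diag(a))):
        return "completion"  # diagonal: singular values can be read off directly
    if square and np.array_equal(a, a.T):
        return "prime"       # symmetric: an eigendecomposition is sufficient
    return "composite"       # anything else: general SVD

def diagonal_singular_values(a: np.ndarray) -> np.ndarray:
    """O(n) shortcut for a diagonal matrix: |diagonal| sorted in descending order."""
    return np.sort(np.abs(np.diag(a)))[::-1]

d = np.diag([3.0, -5.0, 1.0])
sym = np.array([[2.0, 1.0], [1.0, 3.0]])
general = np.arange(6.0).reshape(2, 3)

roles = [classify_role(m) for m in (d, sym, general)]
shortcut = diagonal_singular_values(d)
reference = np.linalg.svd(d, compute_uv=False)
```

The diagonal check runs first because a diagonal matrix is also symmetric, and the cheaper route should win.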
API
from zerofold import svd, pca, ZeroSubstrate
# Drop-in functions (global shared substrate)
r = svd(X, n_components=64)
r.U # (m, k) left singular vectors
r.S # (k,) singular values
r.Vt # (k, n) right singular vectors
r.from_receipt # True if returned from cache
r.algorithm # "receipt" | "completion_exact" | "prime_exact" | "composite_exact"
r = pca(X, n_components=50)
r.components # (k, n_features)
r.explained_var_ratio # (k,)
r.transform(X_new) # project new data
r.inverse_transform(Z) # reconstruct
# Explicit substrate (isolated cache, useful for namespacing)
substrate = ZeroSubstrate(max_receipts=10_000)
r = substrate.svd(X, n_components=64)
print(substrate.stats())
# {'hits': 8, 'misses': 2, 'hit_rate': 0.8, 'receipts_stored': 2}
substrate.clear() # evict all cached results
# Disk persistence — survives restarts, shared across workers
substrate = ZeroSubstrate(cache_dir="/tmp/zerofold_cache")
r = substrate.svd(X, n_components=64) # computed + saved to disk
# restart process, new worker, same cache_dir:
substrate2 = ZeroSubstrate(cache_dir="/tmp/zerofold_cache")
r2 = substrate2.svd(X, n_components=64) # loaded from disk, from_receipt=True
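Disk persistence of this kind can be approximated with plain NumPy. The sketch below assumes one `.npz` file per matrix, named by a SHA-256 of its dtype, shape, and bytes. The `disk_cached_svd` helper and the file layout are assumptions for illustration and say nothing about how `ZeroSubstrate` stores its receipts.

```python
import hashlib
import tempfile
from pathlib import Path

import numpy as np

def disk_cached_svd(a: np.ndarray, cache_dir: Path):
    """Return (U, S, Vt, from_disk), loading a saved result when one exists."""
    a = np.ascontiguousarray(a)
    header = str((a.dtype.str, a.shape)).encode()
    key = hashlib.sha256(header + a.tobytes()).hexdigest()
    path = cache_dir / f"{key}.npz"
    if path.exists():
        with np.load(path) as f:
            return f["U"], f["S"], f["Vt"], True
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    np.savez(path, U=u, S=s, Vt=vt)  # .npz stores the raw array bytes unchanged
    return u, s, vt, False

cache_dir = Path(tempfile.mkdtemp())
rng = np.random.default_rng(42)
x = rng.standard_normal((32, 16))

u1, s1, vt1, hit1 = disk_cached_svd(x, cache_dir)  # computed and written
u2, s2, vt2, hit2 = disk_cached_svd(x, cache_dir)  # read back from the .npz file
```

Any process pointed at the same directory would find the same file. A production version would also need to handle concurrent writers, for example by writing to a temporary file and renaming it into place.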
Real-world value
If your ML inference pipeline recomputes SVD on the same weight matrices:
- n=512 weight matrix: ~280 ms on the first call, ~1.6 ms on every call after that
- 1000 batches/day → saves ~278 seconds/day per matrix
- At scale: the savings compound across every layer, every model, every deployment
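The per-matrix figure follows directly from the Test 1 timings. A short worked version, assuming a cold cache at the start of the day so that the first batch is a miss:

```python
first_call_s = 0.280    # ~280 ms for n=512, from Test 1
retrieval_s = 0.0016    # ~1.6 ms for n=512, from Test 1
batches_per_day = 1000

hits = batches_per_day - 1                  # the first call of a cold cache is a miss
saved_per_hit = first_call_s - retrieval_s  # time avoided on each cached call
saved_per_day = hits * saved_per_hit

print(f"~{saved_per_day:.0f} s/day per matrix")
```

The result scales linearly with the number of batches and with the number of distinct matrices that are reused.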
"We reduced inference cost by 30–70% on fixed-weight workloads." That is where the acquisition conversations start.
License
Business Source License 1.1. Free for individuals, researchers, and startups under $1M revenue. Converts to Apache 2.0 on 2027-01-01. Commercial license available — contact [your email].