Deterministic compute cache for SVD and PCA — lossless, 100-300x faster on repeated calls (same matrix)

ZeroFold

100–300× faster on repeated calls with the same matrix (about 80× at n=128, rising to about 270× at n=2048 in the benchmark below). Lossless, deterministic, zero bit difference.

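pip install zerofold

import numpy as np
from zerofold import svd, pca

# Example input; any 2-D array that you reuse across calls works the same way
weight_matrix = np.random.default_rng(42).standard_normal((512, 512))

# First call: computes at standard speed (NumPy/SciPy)
result = svd(weight_matrix, n_components=64)

# Every subsequent call: cache retrieval, no recomputation, bitwise identical output
result = svd(weight_matrix, n_components=64)  # microseconds to milliseconds, not seconds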

What this is

A deterministic compute cache for expensive linear algebra operations.

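| Call       | Cost                              | Output                          |
|------------|-----------------------------------|---------------------------------|
| First      | Standard NumPy/SciPy speed        | Exact result, stored            |
| Subsequent | Cache retrieval, no recomputation | Bitwise identical to first call |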

No approximation. No tolerance. Zero bit difference between calls.

Important: Speedups occur only when the same matrix is reused. First-time computations run at standard speed. If every matrix you compute on is unique, this tool is not for you.
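
The caching idea described above can be sketched with NumPy and hashlib alone. This is a minimal illustration, not ZeroFold's implementation: the names `matrix_key`, `cached_svd`, and `_cache` are hypothetical, and keying on a SHA-256 digest of the raw bytes is an assumption about how "same matrix" might be detected.

```python
import hashlib
import numpy as np

_cache = {}  # hypothetical in-memory store: key -> (U, S, Vt)

def matrix_key(a, n_components):
    """Digest of dtype, shape, requested rank, and the raw matrix bytes."""
    a = np.ascontiguousarray(a)
    h = hashlib.sha256()
    h.update(str((a.dtype.str, a.shape, n_components)).encode())
    h.update(a.tobytes())
    return h.hexdigest()

def cached_svd(a, n_components):
    """Return (U, S, Vt, hit); compute only when the key is new."""
    key = matrix_key(a, n_components)
    hit = key in _cache
    if not hit:
        U, S, Vt = np.linalg.svd(a, full_matrices=False)
        k = n_components
        _cache[key] = (U[:, :k], S[:k], Vt[:k])
    U, S, Vt = _cache[key]
    return U, S, Vt, hit

# Usage: the second call returns the stored arrays, so the bytes match exactly.
demo = np.random.default_rng(0).standard_normal((256, 128))
U1, S1, Vt1, hit1 = cached_svd(demo, 16)
U2, S2, Vt2, hit2 = cached_svd(demo, 16)
print(hit1, hit2, S1.tobytes() == S2.tobytes())
```

In this sketch the lookup still hashes every byte of the input, so a hit costs time proportional to the matrix size even though the decomposition itself is skipped.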


When this is useful

| Use case                        | Why it helps                                             |
|---------------------------------|----------------------------------------------------------|
| Neural network inference        | Same weight matrices queried every batch → 99%+ hit rate |
| Repeated analytics pipelines    | Same dataset processed repeatedly                        |
| Scientific computing            | Same Laplacian/Hamiltonian, different parameters         |
| Feature engineering / PCA reuse | Common in production ML pipelines                        |

When this has zero value

  • One-off computations on unique matrices
  • Streaming data where every matrix is different
  • Workloads with no matrix reuse

Benchmark (SEED=42 — run it yourself, get the same correctness results)

python -X utf8 benchmark.py

Test 1 — Same matrix repeated calls: first call vs retrieval

| n    | First call | Retrieval | Speedup |
|------|------------|-----------|---------|
| 128  | ~10 ms     | ~120 µs   | ~80×    |
| 512  | ~280 ms    | ~1.6 ms   | ~175×   |
| 1024 | ~1.6 s     | ~6.5 ms   | ~245×   |
| 2048 | ~5.9 s     | ~22 ms    | ~270×   |

Timing varies by hardware. Correctness results are identical on every machine.
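
The speedup column can be recomputed from the two timing columns. The short check below uses only the approximate figures quoted in the Test 1 table; the per-element retrieval cost is an extra derived quantity, added here to show how retrieval time scales with the number of matrix entries (assuming square n×n inputs, which the table does not state explicitly).

```python
# (n, first_call_seconds, retrieval_seconds), copied from the Test 1 table
rows = [
    (128, 10e-3, 120e-6),
    (512, 280e-3, 1.6e-3),
    (1024, 1.6, 6.5e-3),
    (2048, 5.9, 22e-3),
]

speedups = {n: first / retrieval for n, first, retrieval in rows}

# Retrieval time divided by the number of entries in an n x n matrix
ns_per_element = {n: retrieval / (n * n) * 1e9 for n, _, retrieval in rows}

for n, _, _ in rows:
    print(f"n={n:5d}  speedup ~{speedups[n]:.0f}x  "
          f"retrieval ~{ns_per_element[n]:.1f} ns per element")
```

The recomputed ratios agree with the rounded speedups in the table. Retrieval stays within a few nanoseconds per element across all four sizes, which suggests that retrieval time grows with the size of the matrix rather than staying constant.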

Test 3 — Neural network weights (fixed per batch)

| Metric                      | Result   |
|-----------------------------|----------|
| Weight matrix hit rate      | 99.5%    |
| Bit difference on retrieval | 0.00e+00 |

Test 4 — Lossless verification

[PASS] n= 64  S_diff=0.00e+00  Vt_diff=0.00e+00  U_diff=0.00e+00
[PASS] n=128  S_diff=0.00e+00  Vt_diff=0.00e+00  U_diff=0.00e+00
[PASS] n=256  S_diff=0.00e+00  Vt_diff=0.00e+00  U_diff=0.00e+00
[PASS] n=512  S_diff=0.00e+00  Vt_diff=0.00e+00  U_diff=0.00e+00
5/5 PASS — all diffs exactly 0
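
A check of this kind can be sketched as follows. This is not the project's benchmark.py; `max_abs_diff` and `bitwise_equal` are hypothetical helpers, and a plain copy stands in for a cached result. The point is the difference between an exact comparison and a tolerance-based one such as `np.allclose`.

```python
import numpy as np

def max_abs_diff(x, y):
    """Largest absolute elementwise difference; exactly 0.0 for identical arrays."""
    return float(np.max(np.abs(x - y)))

def bitwise_equal(x, y):
    """True only if dtype, shape, and every byte match."""
    return x.dtype == y.dtype and x.shape == y.shape and x.tobytes() == y.tobytes()

rng = np.random.default_rng(42)
a = rng.standard_normal((64, 64))
U, S, Vt = np.linalg.svd(a, full_matrices=False)
stored = (U.copy(), S.copy(), Vt.copy())  # stands in for a cached result

s_diff = max_abs_diff(S, stored[1])
status = "PASS" if s_diff == 0.0 and bitwise_equal(S, stored[1]) else "FAIL"
print(f"[{status}] n= 64  S_diff={s_diff:.2e}")  # prints: [PASS] n= 64  S_diff=0.00e+00

# A one-ulp change passes a tolerance check but fails the exact one.
perturbed = S.copy()
perturbed[0] = np.nextafter(perturbed[0], np.inf)
print(np.allclose(S, perturbed), bitwise_equal(S, perturbed))  # prints: True False
```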

How it works

Role classification routes first-time computation to the fastest correct algorithm, then stores the result indexed by the matrix's structural signature:

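| Role       | Matrix type   | First-call algorithm                            |
|------------|---------------|-------------------------------------------------|
| Completion | Near-identity | Diagonal shortcut: O(n), exact                  |
| Prime      | Symmetric     | scipy.linalg.eigh: faster for symmetric, exact  |
| Composite  | General       | numpy.linalg.svd: full precision                |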

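After the first call, every subsequent call is a cache retrieval regardless of role. The returned result is the stored value, not recomputed and not approximated. Retrieval cost does not depend on which algorithm produced the result, but it is not constant in matrix size: in the benchmark above it grows from about 120 µs at n=128 to about 22 ms at n=2048.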
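
The routing in the table above might look roughly like the sketch below. Everything here is an assumption made for illustration: `classify_role` and `svd_by_role` are hypothetical names, the "completion" test is simplified to exactly diagonal matrices, and `numpy.linalg.eigh` is swapped in for the SciPy routine so the example depends on NumPy only. For a symmetric matrix the singular values are the absolute eigenvalues, and the sign of each eigenvalue is folded into the matching column of U.

```python
import numpy as np

def classify_role(a):
    """Hypothetical routing rule; role names follow the table above."""
    square = a.ndim == 2 and a.shape[0] == a.shape[1]
    if square and np.array_equal(a, np.diag(np.diag(a))):
        return "completion"  # assumption: exactly diagonal input
    if square and np.array_equal(a, a.T):
        return "prime"       # exactly symmetric
    return "composite"

def _sign(x):
    # Use +1 for zeros so the columns of U keep unit norm
    return np.where(x < 0, -1.0, 1.0)

def svd_by_role(a):
    """Return (role, U, S, Vt) with S sorted in descending order."""
    role = classify_role(a)
    if role == "completion":
        d = np.diag(a)
        order = np.argsort(-np.abs(d), kind="stable")
        eye = np.eye(a.shape[0])
        return role, eye[:, order] * _sign(d[order]), np.abs(d[order]), eye[order, :]
    if role == "prime":
        w, V = np.linalg.eigh(a)
        order = np.argsort(-np.abs(w), kind="stable")
        V = V[:, order]
        return role, V * _sign(w[order]), np.abs(w[order]), V.T
    U, S, Vt = np.linalg.svd(a, full_matrices=False)
    return role, U, S, Vt

# Usage: a diagonal matrix takes the shortcut path.
role, U, S, Vt = svd_by_role(np.diag([3.0, -1.0, 2.0]))
print(role, S)
```

Results from the different paths agree with a general SVD up to floating-point rounding and sign conventions, not necessarily bit for bit, so a cache built on such routing would have to store whatever the first call produced.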


API

from zerofold import svd, pca, ZeroSubstrate

# Drop-in functions (global shared substrate)
r = svd(X, n_components=64)
r.U             # (m, k) left singular vectors
r.S             # (k,)   singular values
r.Vt            # (k, n) right singular vectors
r.from_receipt  # True if returned from cache
r.algorithm     # "receipt" | "completion_exact" | "prime_exact" | "composite_exact"

r = pca(X, n_components=50)
r.components            # (k, n_features)
r.explained_var_ratio   # (k,)
r.transform(X_new)      # project new data
r.inverse_transform(Z)  # reconstruct

# Explicit substrate (isolated cache, useful for namespacing)
substrate = ZeroSubstrate(max_receipts=10_000)
r = substrate.svd(X, n_components=64)
print(substrate.stats())
# {'hits': 8, 'misses': 2, 'hit_rate': 0.8, 'receipts_stored': 2}

substrate.clear()  # evict all cached results

# Disk persistence — survives restarts, shared across workers
substrate = ZeroSubstrate(cache_dir="/tmp/zerofold_cache")
r = substrate.svd(X, n_components=64)  # computed + saved to disk
# restart process, new worker, same cache_dir:
substrate2 = ZeroSubstrate(cache_dir="/tmp/zerofold_cache")
r2 = substrate2.svd(X, n_components=64)  # loaded from disk, from_receipt=True
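
To make the `max_receipts`, `stats()`, and `clear()` behaviour above concrete, here is a small standard-library sketch of a bounded store with hit and miss counters. It is not ZeroFold's `ZeroSubstrate`: the class name `ReceiptStore`, the `get_or_compute` method, and the least-recently-used eviction policy are all assumptions, since the description does not say what happens when the limit is reached.

```python
from collections import OrderedDict

class ReceiptStore:
    """Hypothetical bounded cache with hit/miss counters."""

    def __init__(self, max_receipts=10_000):
        self.max_receipts = max_receipts
        self._receipts = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key, compute):
        """Return the stored value for key, calling compute() only on a miss."""
        if key in self._receipts:
            self.hits += 1
            self._receipts.move_to_end(key)  # mark as most recently used
            return self._receipts[key]
        self.misses += 1
        value = compute()
        self._receipts[key] = value
        if len(self._receipts) > self.max_receipts:
            self._receipts.popitem(last=False)  # assumed policy: evict the oldest
        return value

    def stats(self):
        total = self.hits + self.misses
        return {
            "hits": self.hits,
            "misses": self.misses,
            "hit_rate": self.hits / total if total else 0.0,
            "receipts_stored": len(self._receipts),
        }

    def clear(self):
        """Drop all stored values; counters are kept."""
        self._receipts.clear()

# Usage: two distinct keys queried five times each.
store = ReceiptStore(max_receipts=10_000)
for _ in range(5):
    for key in ("layer0", "layer1"):
        store.get_or_compute(key, lambda: object())
print(store.stats())
```

With two distinct keys and ten calls, the counters come out as 8 hits and 2 misses, matching the shape of the `stats()` output shown in the API listing.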

Real-world value

If your ML inference pipeline recomputes SVD on the same weight matrices:

  • n=512 weight matrix → ~280ms → ~1.6ms after first call
  • 1000 batches/day → saves ~278 seconds/day per matrix
  • At scale: the savings compound across every layer, every model, every deployment
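
The daily saving quoted above follows from the n=512 benchmark figures. The arithmetic below is a worked check under the stated assumptions (one matrix, 1000 calls per day, the first call at full cost and the remaining 999 served from cache); the variable names are only for illustration.

```python
first_call_s = 0.280     # ~280 ms, n=512 first call
retrieval_s = 0.0016     # ~1.6 ms, n=512 retrieval
batches_per_day = 1000

# Without a cache, every batch pays the full decomposition cost.
uncached_s = batches_per_day * first_call_s

# With a cache, one full computation is followed by retrievals.
cached_s = first_call_s + (batches_per_day - 1) * retrieval_s

saved_s = uncached_s - cached_s
print(f"uncached: {uncached_s:.1f} s, cached: {cached_s:.2f} s, saved: {saved_s:.1f} s")
```

This gives about 278 seconds saved per day per matrix, in line with the figure above. The saving scales linearly with the number of repeated calls and with the number of distinct matrices that are reused.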

"We reduced inference cost by 30–70% on fixed-weight workloads." That is where the acquisition conversations start.


License

Business Source License 1.1. Free for individuals, researchers, and startups under $1M revenue. Converts to Apache 2.0 on 2027-01-01. Commercial license available — contact [your email].
