Sparse Spectral Encoding for cold-tier vector memory (Rust port with PyO3 bindings)

SpectraLTM

Python 3.10+ · License: Apache 2.0

Rust + PyO3 port of the Sparse Spectral Encoding algorithm for cold-tier vector memory

spectraltm is the AVX2-accelerated Rust implementation of the spectral_codes reference from the Sparse Spectral Encoding for Cold-Tier Vector Memory research project. It implements the same algorithm, runs 5–10× faster than the Python reference, and is exposed as a drop-in Python module.

Features

  • SIMD-accelerated inner loop: AVX2 fast-path on x86_64; portable scalar fallback for other platforms
  • Drop-in Python API: SpectralEncoder, SpectralIndex, .encode(), .add_embeddings(), .search(), matching the Python reference (as of v0.1.1, inputs must be plain Python lists; see Installation)
  • Batch-first: encode and search batches, not single vectors
  • 24–192 B per chunk: tiny cold-tier storage vs float32 embeddings (~1.5 KB per vector at dim=384)
  • Sub-millisecond search on N=5K corpora at K=8 (0.224–0.317 ms/query on BEIR scifact); scales linearly with N
  • Parity-tested: tests/test_spectraltm_rust_parity.py cross-checks against the Python reference for top-1/top-10 agreement
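
The top-1/top-10 agreement mentioned in the last bullet can be measured along the following lines. This is only a sketch: the metric and thresholds actually used in tests/test_spectraltm_rust_parity.py are not shown here, and the function name topk_agreement is hypothetical.

```python
def topk_agreement(ref_ids, test_ids, k):
    """Mean fraction of the reference top-k that also appears in the test top-k.

    ref_ids / test_ids: one ranked list of corpus indices per query,
    e.g. the nested lists returned by a batch search.
    """
    if len(ref_ids) != len(test_ids):
        raise ValueError("both result sets must cover the same queries")
    overlaps = [
        len(set(r[:k]) & set(t[:k])) / k
        for r, t in zip(ref_ids, test_ids)
    ]
    return sum(overlaps) / len(overlaps)


# Toy data: two queries, rankings from a reference and from a port.
ref = [[3, 1, 2], [7, 8, 9]]
port = [[3, 2, 5], [8, 7, 9]]
print(topk_agreement(ref, port, k=1))  # 0.5 (only the first query agrees on top-1)
print(topk_agreement(ref, port, k=3))
```

Set overlap ignores rank order inside the top-k, so it tolerates ties being broken differently by two implementations.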

Installation

pip install spectraltm

Requires Python 3.10+ and numpy>=1.20. As of v0.1.1 the Python bindings only accept plain Python lists (use arr.tolist() on any numpy array before passing it in), due to a PyO3 / numpy ABI mismatch that crashes on numpy >= 2.4. See the Quick start example for the working pattern.

Quick start

import numpy as np
from spectraltm import SpectralEncoder, SpectralIndex

# Build the encoder (calibrate first on a representative sample).
# NOTE: pass plain Python lists, not numpy arrays. The Rust PyO3 binding
# crashes on numpy >= 2.4 due to an ABI mismatch (PyO3 0.29 was built
# against an older numpy ABI). The .tolist() path works on every numpy
# version and is what the parity test suite uses.
enc = SpectralEncoder(dim=384, top_k=64, mag_bits=8, phase_bits=8, norm_bits=8)
sample = np.random.randn(1000, 384).astype(np.float32)
enc.calibrate(sample.reshape(-1).tolist())

# Build the index
db = np.random.randn(10_000, 384).astype(np.float32)
idx = SpectralIndex(enc)
idx.add_embeddings(db.tolist())

# Search
queries = np.random.randn(100, 384).astype(np.float32)
top_idx, top_scores = idx.search_batch(queries.tolist(), top_k=10)
# search_batch returns nested Python lists of shape (n_queries, top_k);
# convert to numpy if you want shape/dtype ergonomics.
top_idx = np.array(top_idx)        # shape: (100, 10)
top_scores = np.array(top_scores)  # shape: (100, 10)

Benchmarks — BEIR scifact (k=10)

spectraltm is a lossy compressed index, not a replacement for brute-force cosine. The notes below the table explain how to read the numbers.

==========================================================================================
SUMMARY  (BEIR scifact, k=10, spec_k=8)
==========================================================================================
engine                                       nDCG@10  MRR@10  R@10    encode doc ms  encode q ms  search ms/q  B/chunk  total MB
------------------------------------------------------------------------------------------
brute-force cosine (all-MiniLM-L6-v2)        0.6451   0.6047  0.7833  2514.8         0.1          0.021        -        -
spectraltm K=8 (all-MiniLM-L6-v2)            0.1732   0.1512  0.2569  2514.8         0.1          0.224        24       0.12
brute-force cosine (BAAI/bge-large-en-v1.5)  0.7346   0.7013  0.8592  44604.3        0.5          0.023        -        -
spectraltm K=8 (BAAI/bge-large-en-v1.5)      0.1346   0.1198  0.1973  44604.3        0.5          0.317        26       0.13

Honest framing:

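  • Brute-force cosine is the gold-standard reference. Spectral layer quality is a steep function of K. At K=8 the table above shows a relative nDCG@10 loss of ~73% on all-MiniLM-L6-v2 (0.1732 vs 0.6451) and ~82% on bge-large-en-v1.5 (0.1346 vs 0.7346); at K=64 the loss is ~12%. Pick K based on your quality bar.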
  • bytes_per_chunk scales linearly with K (24 B at K=8, 192 B at K=64 for dim=384). Storage cost is real, not just latency.
  • Search latency grows with K (each of the query's K retained bins triggers one column scan over the corpus). For BEIR scifact (N=5K) it's < 3 ms at K=64; at N=50K expect 10–30 ms with the scalar inner loop, much faster with AVX2.
  • encode doc ms / encode q ms come from the same embedder call, so they are identical for both engines on a given model and are not per-engine costs. The engine-specific columns are search ms/q, B/chunk, and total MB.
  • The right K depends on storage budget and quality bar. K=32 is a reasonable middle ground (96 B/chunk, ~14% nDCG gap).
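
The storage figures above can be reproduced with a simple bit-packing model. This is an inference from the published numbers, not a description of the actual on-disk layout. It assumes each retained coefficient stores a bin index, a magnitude, and a phase, that the number of bins is dim // 2 + 1 (as for a real FFT), and that the quantized norm is not counted. The function name bytes_per_chunk mirrors the table column but is a hypothetical helper.

```python
def bytes_per_chunk(dim, k, mag_bits=8, phase_bits=8):
    """Estimated bytes per encoded vector under the assumed bit-packing model."""
    n_bins = dim // 2 + 1                 # assumed: real-FFT bin count
    idx_bits = (n_bins - 1).bit_length()  # bits needed to address one bin
    total_bits = k * (idx_bits + mag_bits + phase_bits)
    return (total_bits + 7) // 8          # round up to whole bytes


for dim, k in [(384, 8), (384, 32), (384, 64), (1024, 8)]:
    print(f"dim={dim:4d} K={k:2d} -> {bytes_per_chunk(dim, k)} B/chunk")

# Index size for a 5K-vector corpus at K=8, dim=384, in MB.
print(5_000 * bytes_per_chunk(384, 8) / 1e6)  # 0.12
```

Under these assumptions the model gives 24 B, 96 B, and 192 B at dim=384 for K=8, 32, and 64, and 26 B at dim=1024 for K=8, matching the figures quoted in this section. The extra 2 B at dim=1024 would come from the wider bin index.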

When to use spectraltm

Use it when:

  • You have a large cold-tier corpus (≥ 100K vectors) and can't keep float32 embeddings online
  • Storage bandwidth dominates query latency (disk-resident index, RAM-constrained)
  • You're willing to trade retrieval quality for storage compression: ~64× at K=8 (24 B/chunk vs 1,536 B/vector at dim=384), dropping to 8× at K=64 (192 B/chunk)

Don't use it when:

  • Brute-force cosine fits in memory (small corpus, fast disk) — the quality gap isn't worth the engineering
  • You need exact top-K recall (spectral layer is lossy by design)
  • Latency-critical path requires < 0.1 ms search (use a flat, HNSW, or IVF index instead)

Project layout

SpectraLTM/
├── src/
│   ├── codes.rs             # SpectralCodes struct (Rust) + quantization
│   ├── encoder.rs           # SpectralEncoder (calibrate + encode)
│   ├── index.rs             # SpectralIndex (inverted mag/phase grids)
│   ├── simd.rs              # AVX2 + scalar inner loop
│   ├── error.rs             # SpectraltmError
│   └── python_bindings.rs   # PyO3 module + SpectralCodes/Index wrappers
├── tests/
│   ├── test_spectraltm_rust_parity.py    # cross-check vs Python reference
│   ├── generate_trace_vectors.py         # corpus fixtures
│   └── test_hypothesis_python_encoded.py # property tests
├── examples/
│   └── beir_scifact_eval.py  # the benchmark that produced the table above
└── pyproject.toml            # maturin build config

How It Works

embeddings ─┐
            ├─► [encoder.calibrate]  ─► quantizer grids (mags, phases)
            │
            ├─► [encoder.encode]      ─► SpectralCodes (idx, mag_q, phase_q, norm_q)
            │
            └─► [index.add_embeddings] ─► dense (N, F) mag + phase grids

queries  ───┬─► [encoder.encode] ─► SpectralCodes (query)
            │
            └─► [index.search]   ─► (N, F) grid lookup ─► top-K scores
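
To make the encode step in the diagram concrete, here is a minimal NumPy sketch of one way to produce (idx, mag_q, phase_q) codes. It is an illustration under stated assumptions, not the library's implementation: it assumes the spectrum is a real FFT of the unit-normalized embedding, keeps the top_k largest-magnitude bins, and quantizes to 8 bits with per-vector scaling instead of the calibrated quantizer grids. The name encode_sketch is hypothetical, and the norm code (norm_q) is omitted.

```python
import numpy as np


def encode_sketch(x, top_k=8):
    """Toy sparse spectral code for one non-zero vector (assumed scheme)."""
    x = np.asarray(x, dtype=np.float32)
    spec = np.fft.rfft(x / np.linalg.norm(x))   # dim // 2 + 1 complex bins

    # Keep the top_k bins with the largest magnitude, in ascending bin order.
    idx = np.sort(np.argsort(-np.abs(spec))[:top_k])
    mag = np.abs(spec[idx])
    phase = np.angle(spec[idx])                  # radians in [-pi, pi]

    # 8-bit quantization. Assumption: magnitudes are scaled by the per-vector
    # maximum; the real encoder uses grids fitted by calibrate().
    mag_q = np.round(mag / mag.max() * 255).astype(np.uint8)
    phase_q = (np.round((phase + np.pi) / (2 * np.pi) * 256) % 256).astype(np.uint8)
    return idx, mag_q, phase_q


rng = np.random.default_rng(0)
idx, mag_q, phase_q = encode_sketch(rng.standard_normal(384), top_k=8)
print(idx.shape, mag_q.dtype, phase_q.dtype)
```

The phase is wrapped modulo 256 so that angles near +pi and -pi map to neighbouring codes.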

Per-frequency-bin scoring is the hot path. SIMD vectorizes the inner loop so a single query bin becomes a packed float32 dot-product over all N corpus entries for that bin. Top-K across bins is a small partial-sort.
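
The access pattern described above (one column scan per query bin, followed by a partial sort) can be sketched in NumPy as follows. The scoring formula used here, the product of the magnitudes times the cosine of the phase difference, is an assumption chosen for illustration; the source does not state the scoring function. The name search_sketch is hypothetical.

```python
import numpy as np


def search_sketch(mag, phase, q_bins, q_mag, q_phase, top_k):
    """Score one sparse query against dense (N, F) magnitude and phase grids."""
    scores = np.zeros(mag.shape[0], dtype=np.float32)
    for b, m, p in zip(q_bins, q_mag, q_phase):
        # One column scan: every corpus entry's value for frequency bin b.
        # Assumed score contribution: m_query * m_doc * cos(phase difference).
        scores += m * mag[:, b] * np.cos(phase[:, b] - p)

    # Partial sort: pick the top_k entries, then order only those.
    k = min(top_k, scores.shape[0])
    part = np.argpartition(-scores, k - 1)[:k]
    order = part[np.argsort(-scores[part])]
    return order, scores[order]


# Toy grids: 4 corpus entries, 3 frequency bins, unit magnitudes.
mag = np.ones((4, 3), dtype=np.float32)
phase = np.array([[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]], dtype=np.float32)
ids, scores = search_sketch(mag, phase, [0, 1, 2], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0], top_k=1)
print(ids, scores)  # entry 2 matches the query phases exactly
```

Each loop iteration touches a single contiguous-stride column of N values, which is the part that a SIMD inner loop would vectorize.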

Development

# Editable install (rebuilds on Rust changes):
python -m maturin develop --release

# Or build a wheel:
python -m maturin build --release
pip install target/wheels/spectraltm-*.whl

See HOW_TO_INSTALL.md for platform-specific setup (Rust toolchain, Python ABI compatibility, NumPy linkage).

Citation

@misc{mckenzie2026spectral,
  title={Sparse Spectral Encoding for Cold-Tier Vector Memory},
  author={Mc Kenzie, Gerald Enrique Nelson},
  year={2026},
  month={6},
  day={28},
  doi={10.5281/zenodo.21005661},
  howpublished={Harmonic Resonance Indexing research project},
  url={https://github.com/lordxmen2k/sparse-spectral-encoding},
  note={Apache License 2.0. Contact: lordxmen2k@gmail.com}
}

Metadata

License

Apache-2.0 — see LICENSE.


Built as a Rust port of spectral_codes — the two projects share zero source code per the explicit separation rule. The parity test (tests/test_spectraltm_rust_parity.py) is the only cross-project coupling and is the regression check that catches SpectraLTM drifting from the reference algorithm.

Download files

Source distributions: none available for this release.

Built distribution: spectraltm-0.1.1-cp314-cp314-win_amd64.whl (608.6 kB, CPython 3.14, Windows x86-64)

Hashes for spectraltm-0.1.1-cp314-cp314-win_amd64.whl

Algorithm    Hash digest
SHA256       b4f5750eba4bafae8dd7be691d1ffed5ff0a3ac842e509c296ac05a004ffce1c
MD5          705d42cfa70484037ea3ec7d94aa5b11
BLAKE2b-256  52c546c4d525111be4772ff894483abbc34c19a70c929d611d21ab70a1e7cfa2
