# turboqvec-rs
Compressed vector store for Python, backed by Rust. Store large float arrays at 8–15x smaller footprint and run cosine similarity queries directly on the compressed form — no decompression needed.
Built on PolarQuant (TurboQuant Stage 1, Google Research, ICLR 2026). Rust core with PyO3 bindings and Rayon parallel scan.
## Install

```shell
pip install turboqvec
```
Build from source:

```shell
git clone https://github.com/Manojython/turboqvec-rs
cd turboqvec-rs
pip install maturin
maturin develop --release
```
## Usage

```python
from turboqvec import CompressedVectorStore

store = CompressedVectorStore(dim=384, bits=4)

# Insert — accepts lists or numpy arrays
store.insert(id=0, vector=my_embedding)
store.insert_batch([(id, vec) for id, vec in pairs])  # parallel, faster

# Query — returns (id, score) tuples sorted by cosine similarity
results = store.query(query_embedding, top_k=10)

# Stats
print(f"{len(store):,} vectors, {store.memory_bytes() / 1e6:.1f} MB compressed")
```
See notebooks/quickstart.ipynb for a full walkthrough with real embeddings.
## Accuracy
Validated on 500 BGE-small-en-v1.5 sentence embeddings (d=384, 20newsgroups).
| bits | R@1 | R@5 | R@10 | Mean cosine error | Compression |
|---|---|---|---|---|---|
| 4-bit | 95% | 96% | 95% | 0.004 | 7.8x |
| 3-bit | 87% | 89% | 92% | 0.011 | 10.4x |
| 2-bit | 73% | 81% | 85% | 0.031 | 15.4x |
Pearson/Spearman correlation of scores vs true cosine: 0.998 at 4-bit.
PolarQuant slightly underestimates cosine similarity (mean drift ≈ -0.003 at 4-bit). Rank order is preserved so top-k retrieval is not affected. For absolute threshold queries (e.g. score > 0.7), add ~0.003 as a calibration offset.
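For absolute-threshold queries, the calibration amounts to one addition before the cutoff. A minimal sketch (the result list is made up; `0.003` is the 4-bit drift measured above):

```python
CALIBRATION_OFFSET = 0.003  # mean 4-bit underestimate of cosine similarity
THRESHOLD = 0.7

# (id, score) pairs as returned by store.query, highest score first (made-up values)
results = [(3, 0.712), (7, 0.699), (1, 0.641)]

# compensate for the systematic underestimate before applying the absolute cutoff
matches = [(i, s) for i, s in results if s + CALIBRATION_OFFSET > THRESHOLD]
# keeps ids 3 and 7; without the offset, 0.699 would be wrongly rejected
```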
## Storage
| n vectors | fp32 | 4-bit | 3-bit | 2-bit |
|---|---|---|---|---|
| 1,000 | 1.5 MB | 0.20 MB | 0.15 MB | 0.10 MB |
| 10,000 | 15.4 MB | 1.96 MB | 1.48 MB | 1.00 MB |
| 100,000 | 153.6 MB | 19.6 MB | 14.8 MB | 10.0 MB |
| 1,000,000 | 1,536 MB | 196 MB | 148 MB | 100 MB |
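The table follows directly from the encoding: each vector costs 4 bytes for the exact norm plus its bit-packed bin indices. A quick sanity check (assuming no per-vector overhead beyond the norm):

```python
import math

def bytes_per_vector(dim: int, bits: int) -> int:
    # 4-byte fp32 norm + ceil(dim * bits / 8) bytes of packed bin indices
    return 4 + math.ceil(dim * bits / 8)

# d=384: 196 B at 4-bit, 148 B at 3-bit, 100 B at 2-bit,
# i.e. 1.96 / 1.48 / 1.00 MB per 10,000 vectors, matching the table
sizes = {bits: bytes_per_vector(384, bits) for bits in (4, 3, 2)}
```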
## How it works

### Ingestion
Each vector goes through: norm → normalize → rotate (R·v̂) → Lloyd-Max quantize → bit-pack. The rotation matrix R is a fixed random orthogonal matrix (from a QR decomposition) that decorrelates the dimensions, so a single codebook fits every coordinate position. Only the norm (4 bytes, exact) and the packed bin indices are stored.
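In numpy terms, the pipeline looks roughly like this. This is a sketch, not the library's code: the 2-bit codebook below is a hypothetical stand-in for the real Lloyd-Max centroids, and bit-packing is omitted.

```python
import numpy as np

rng = np.random.default_rng(42)
d = 8

# Fixed random orthogonal rotation, shared by every vector in the store
R, _ = np.linalg.qr(rng.standard_normal((d, d)))

# Hypothetical stand-in for the Lloyd-Max codebook (one codebook for all coordinates)
centroids = np.array([-0.53, -0.16, 0.16, 0.53])  # 2 bits -> 4 bins

def encode(v):
    norm = np.linalg.norm(v)          # stored exactly as 4 bytes
    rotated = R @ (v / norm)          # normalize, then decorrelate with R
    # nearest-centroid index per coordinate (the Lloyd-Max quantize step)
    idx = np.abs(rotated[:, None] - centroids[None, :]).argmin(axis=1)
    return norm, idx                  # idx would be bit-packed, 2 bits per coordinate

norm, codes = encode(rng.standard_normal(d))
```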
### Retrieval

For a query q, we build a lookup table once: `lut[i][b] = (R·q)[i] * centroid[b]`. Then for each database vector, the score is a sum of d table lookups plus one multiply — no matrix math per vector. Rayon parallelises the scan across all N entries.
This is asymmetric distance computation (the same idea FAISS PQ uses). One matrix multiply per query regardless of N.
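A self-contained numpy sketch of the LUT scan (toy codebook and random codes; the real store uses Lloyd-Max centroids and bit-packed indices):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_bins, n = 8, 4, 1000
R, _ = np.linalg.qr(rng.standard_normal((d, d)))
centroids = rng.standard_normal(n_bins)        # toy per-coordinate codebook
codes = rng.integers(0, n_bins, size=(n, d))   # bin indices of the stored vectors
norms = rng.uniform(0.5, 2.0, size=n)          # stored fp32 norms

def query_scores(q):
    rq = R @ (q / np.linalg.norm(q))           # one rotation per query, regardless of N
    lut = rq[:, None] * centroids[None, :]     # lut[i][b] = (R·q)[i] * centroid[b]
    # per vector: d table lookups summed, then one multiply by the stored norm
    return norms * lut[np.arange(d), codes].sum(axis=1)

scores = query_scores(rng.standard_normal(d))
```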
## Why not QJL (Stage 2)?
TurboQuant's Stage 2 adds a 1-bit QJL residual sketch for KV-cache attention vectors. For normalized sentence embeddings the Stage 1 bias is already near zero; adding QJL introduced noise and dropped the Pearson correlation from 0.996 to 0.48. Stage 1 alone is the right primitive for standard embeddings.
## Repo layout

```
turboqvec-rs/
├── src/                  Rust source
│   ├── codebook.rs       Lloyd-Max quantizer, global cache
│   ├── rotation.rs       QR decomposition, global cache
│   ├── encode.rs         encode pipeline + bit packing + Rayon batch
│   ├── decode.rs         decode pipeline (for reconstruction)
│   ├── similarity.rs     asymmetric LUT scoring
│   ├── store.rs          CompressedVectorStore
│   └── py_bindings.rs    PyO3 wrapper
├── python/turboqvec/     Python package
├── notebooks/            Quickstart notebook
├── benchmarks/           Comparison scripts vs fp32 baseline
└── tests/                Rust integration tests
```
## Related
- fastrustrag — MinHash deduplication for RAG pipelines
- pageindex-rs — Hierarchical document indexing for LLMs
- TurboQuant paper — Google Research, ICLR 2026
## License
MIT
## File details: turboqvec-0.1.0.tar.gz

- Size: 27.8 kB
- Tags: Source
- Uploaded via: maturin/1.12.4
- Uploaded using Trusted Publishing? No

| Algorithm | Hash digest |
|---|---|
| SHA256 | 408e75d5f3457a3391482414b2b73e4b69fc42243f17dc26516166d2358c0f79 |
| MD5 | a5fea7cf507a68e60ca1d0fb766fcf6d |
| BLAKE2b-256 | 63d03a2e48dd76cd9398d829554f6d95b282d1ed54855a1d56645a8a523cfabf |
## File details: turboqvec-0.1.0-cp312-cp312-macosx_11_0_arm64.whl

- Size: 286.8 kB
- Tags: CPython 3.12, macOS 11.0+ ARM64
- Uploaded via: maturin/1.12.4
- Uploaded using Trusted Publishing? No

| Algorithm | Hash digest |
|---|---|
| SHA256 | dc91cb2864012d454a3c96e3932eca3ef624cfa4a8d3ada2b7a490ce18266341 |
| MD5 | a352f52b6064ccb8a2e169f53202e94c |
| BLAKE2b-256 | 4a5c8002425615d3c458f0b496648c3276c8c35902c03a26d3a8c4c8d69f2d06 |