Compressed vector store using PolarQuant — 8-15x smaller float arrays with fast inner product queries

turboqvec-rs

Compressed vector store for Python, backed by Rust. Store large float arrays at an 8–15x smaller footprint and run cosine-similarity queries directly on the compressed form — no decompression needed.

Built on PolarQuant (TurboQuant Stage 1, Google Research, ICLR 2026). Rust core with PyO3 bindings and Rayon parallel scan.


Install

pip install turboqvec

Build from source:

git clone https://github.com/Manojython/turboqvec-rs
cd turboqvec-rs
pip install maturin
maturin develop --release

Usage

from turboqvec import CompressedVectorStore

store = CompressedVectorStore(dim=384, bits=4)

# Insert — accepts lists or numpy arrays
store.insert(id=0, vector=my_embedding)
store.insert_batch([(vid, vec) for vid, vec in pairs])  # parallel, faster

# Query — returns (id, score) tuples sorted by cosine similarity
results = store.query(query_embedding, top_k=10)

# Stats
print(f"{len(store):,} vectors, {store.memory_bytes() / 1e6:.1f} MB compressed")

See notebooks/quickstart.ipynb for a full walkthrough with real embeddings.


Accuracy

Validated on 500 BGE-small-en-v1.5 sentence embeddings (d=384, 20newsgroups).

| bits  | R@1 | R@5 | R@10 | Mean cosine error | Compression |
|-------|-----|-----|------|-------------------|-------------|
| 4-bit | 95% | 96% | 95%  | 0.004             | 7.8x        |
| 3-bit | 87% | 89% | 92%  | 0.011             | 10.4x       |
| 2-bit | 73% | 81% | 85%  | 0.031             | 15.4x       |

Pearson/Spearman correlation of scores vs true cosine: 0.998 at 4-bit.

PolarQuant slightly underestimates cosine similarity (mean drift ≈ -0.003 at 4-bit). Rank order is preserved, so top-k retrieval is unaffected. For absolute threshold queries (e.g. score > 0.7), add ~0.003 as a calibration offset.
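A minimal sketch of that calibration, applied after querying (`threshold_filter` is a hypothetical helper, not part of the library; the 0.003 offset is the measured 4-bit drift, and should be adjusted per bit width):

```python
def threshold_filter(results, cutoff=0.7, drift=0.003):
    """Compensate for PolarQuant's slight cosine underestimate before thresholding."""
    return [(vid, score) for vid, score in results if score + drift >= cutoff]

threshold_filter([(0, 0.698), (1, 0.72), (2, 0.5)])
# returns [(0, 0.698), (1, 0.72)]: 0.698 clears the 0.7 cutoff once the offset is added
```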


Storage

| n vectors | fp32     | 4-bit   | 3-bit   | 2-bit   |
|-----------|----------|---------|---------|---------|
| 1,000     | 1.5 MB   | 0.20 MB | 0.15 MB | 0.10 MB |
| 10,000    | 15.4 MB  | 1.96 MB | 1.48 MB | 1.00 MB |
| 100,000   | 153.6 MB | 19.6 MB | 14.8 MB | 10.0 MB |
| 1,000,000 | 1,536 MB | 196 MB  | 148 MB  | 100 MB  |
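The table follows from a simple per-vector budget: 4 bytes for the exact norm plus ⌈d·bits/8⌉ bytes of packed indices. A quick sanity check (`compressed_bytes` is an illustrative helper, not a library function):

```python
def compressed_bytes(n_vectors: int, dim: int = 384, bits: int = 4) -> int:
    """Store size in bytes: per vector, a 4-byte norm plus ceil(dim * bits / 8) packed bytes."""
    per_vector = 4 + (dim * bits + 7) // 8
    return n_vectors * per_vector

print(compressed_bytes(10_000) / 1e6)          # 1.96 (MB), matching the 4-bit row
print(compressed_bytes(10_000, bits=2) / 1e6)  # 1.0 (MB)
```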

How it works

Ingestion

Each vector goes through: norm → normalise → rotate (R·v̂) → Lloyd-Max quantize → bit-pack. The rotation matrix R is a fixed random orthogonal matrix (QR decomp) that decorrelates all dimensions so a single codebook fits every coordinate position. Only the norm (4 bytes exact) and the packed bin indices are stored.
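A NumPy sketch of that pipeline (illustrative only: the codebook here is a uniform stand-in for the real Lloyd-Max centroids, and d = 384, bits = 4 are assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
d, bits = 384, 4
R, _ = np.linalg.qr(rng.standard_normal((d, d)))  # fixed random orthogonal rotation
centroids = np.linspace(-0.15, 0.15, 2 ** bits)   # stand-in codebook; coords of R·v̂ are ~N(0, 1/d)

def encode(v):
    norm = np.linalg.norm(v)                      # stored exactly as a 4-byte f32
    rotated = R @ (v / norm)                      # normalise, then decorrelate
    idx = np.abs(rotated[:, None] - centroids).argmin(axis=1)  # nearest-centroid bin per coord
    packed = ((idx[0::2] << 4) | idx[1::2]).astype(np.uint8)   # two 4-bit codes per byte
    return norm, packed

norm, packed = encode(rng.standard_normal(d))
print(packed.nbytes + 4)  # 196 bytes per vector, vs 1536 for fp32 (~7.8x)
```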

Retrieval

For a query q, we build a lookup table once: lut[i][b] = (R·q)[i] × centroid[b]. Then for each database vector, the score is a sum of d table lookups plus one multiply — no matrix math per vector. Rayon parallelises the scan across all N entries.

This is asymmetric distance computation (the same idea FAISS PQ uses): one matrix-vector multiply per query, regardless of N.
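A self-contained NumPy sketch of the scoring path (again with a uniform stand-in codebook and bit-packing omitted; the library's Rust scan performs the same lookups in parallel):

```python
import numpy as np

rng = np.random.default_rng(0)
d, bits = 384, 4
R, _ = np.linalg.qr(rng.standard_normal((d, d)))  # fixed random orthogonal rotation
centroids = np.linspace(-0.15, 0.15, 2 ** bits)   # stand-in codebook

def encode(v):
    norm = np.linalg.norm(v)
    idx = np.abs((R @ (v / norm))[:, None] - centroids).argmin(axis=1)
    return norm, idx                              # bin indices; packing omitted here

def score_all(q, codes, norms):
    rq = R @ (q / np.linalg.norm(q))
    lut = rq[:, None] * centroids[None, :]        # lut[i][b] = (R·q)[i] * centroid[b], built once
    # per vector: d table lookups summed, then one multiply by the stored norm
    return norms * lut[np.arange(d)[:, None], codes.T].sum(axis=0)

db = rng.standard_normal((1000, d))
norms = np.linalg.norm(db, axis=1)
codes = np.stack([encode(v)[1] for v in db])
q = rng.standard_normal(d)
scores = score_all(q, codes, norms)
top10 = np.argsort(-scores)[:10]                  # ranking, as in store.query(..., top_k=10)
```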

Why not QJL (Stage 2)?

TurboQuant adds a 1-bit residual sketch for KV cache attention vectors. For normalised sentence embeddings the Stage 1 bias is already near zero — adding QJL introduced noise and dropped Pearson correlation from 0.996 to 0.48. Stage 1 alone is the right primitive for standard embeddings.


Repo layout

turboqvec-rs/
├── src/               Rust source
│   ├── codebook.rs    Lloyd-Max quantizer, global cache
│   ├── rotation.rs    QR decomposition, global cache
│   ├── encode.rs      encode pipeline + bit packing + Rayon batch
│   ├── decode.rs      decode pipeline (for reconstruction)
│   ├── similarity.rs  asymmetric LUT scoring
│   ├── store.rs       CompressedVectorStore
│   └── py_bindings.rs PyO3 wrapper
├── python/turboqvec/  Python package
├── notebooks/         Quickstart notebook
├── benchmarks/        Comparison scripts vs fp32 baseline
└── tests/             Rust integration tests


License

MIT
