# turboquant-vectors

Compress and protect embeddings with TurboQuant.
Two tools in one package:
- PrivateEncoder -- rotate embeddings with a secret key. Search works identically. Inversion attacks fail.
- compress/search -- 8x compression, no training needed, instant.
```python
from turboquant_vectors import PrivateEncoder

encoder = PrivateEncoder.generate(dim=1536)
rotated = encoder.rotate(embeddings)  # search works identically
encoder.save_key("secret.tqkey")      # treat like an SSH key
```
## Embedding Privacy
Vec2Text recovers 92% of original text from unprotected embeddings (32-token inputs, GTR-base encoder). ALGEN needs only 1,000 leaked pairs. OWASP lists this as LLM08 in their 2025 Top 10.
PrivateEncoder applies a secret orthogonal rotation `Q` before you send embeddings to a third-party vector DB. The math:

```
<Qx, Qy> = (Qx)^T (Qy) = x^T Q^T Q y = x^T y = <x, y>
```
Cosine similarity, L2 distance, inner product -- all preserved exactly (up to float32 precision, ~1e-6 error).
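
You can check the identity yourself with plain numpy. This is a standalone sketch: it draws a random orthogonal matrix via QR, whereas the library derives `Q` from the secret key.

```python
import numpy as np

d = 1536
# Random orthogonal matrix via QR decomposition (illustrative only --
# PrivateEncoder derives Q from the secret key, not fresh randomness)
Q, _ = np.linalg.qr(np.random.randn(d, d))

x, y = np.random.randn(d), np.random.randn(d)
print(x @ y, (Q @ x) @ (Q @ y))                            # inner product preserved
print(np.linalg.norm(x - y), np.linalg.norm(Q @ (x - y)))  # L2 distance preserved
```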
## Quick start
```python
from turboquant_vectors import PrivateEncoder
import numpy as np

# Generate a secret key (uses OS entropy)
encoder = PrivateEncoder.generate(dim=1536)
encoder.save_key("secret.tqkey")

# Rotate before uploading to Pinecone/Weaviate/Qdrant
rotated = encoder.rotate(embeddings)
# pinecone_index.upsert(vectors=rotated.tolist(), ids=ids)

# Rotate the query too (same key)
rotated_query = encoder.rotate(query)
# results = pinecone_index.query(vector=rotated_query.tolist(), top_k=10)

# Later, load the same key
encoder = PrivateEncoder.load_key("secret.tqkey")
```
## What it protects against
- Vec2Text (92% text recovery from embeddings) -- fails completely on rotated vectors
- ALGEN (few-shot inversion with 1K pairs) -- fails without the rotation key
- ZSinvert / Zero2Text (zero-shot inversion) -- fails on rotated embedding space
- Attribute classifiers (age, sex, medical conditions from embeddings) -- drop to random chance
Our test suite demonstrates this: a classifier trained on original embeddings achieves >80% accuracy but drops to <35% (near random chance) on rotated vectors from the same data.
## What it does NOT protect against

To be honest about the threat model:
- Known-plaintext attack: d original-rotated pairs (e.g., 1,536 for OpenAI embeddings) fully recover the key via SVD; see the sketch after this list. Don't let anyone see both the original AND rotated versions of the same content.
- Pairwise distances are visible: The server can see which documents are similar to each other, cluster structure, and query patterns. It just can't read what any document says.
- Key compromise: If the key file leaks, all rotated vectors are trivially recoverable.
- RAG output attacks: Membership inference via LLM output is not mitigated.
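
To make the known-plaintext caveat concrete, here is a minimal sketch of the attack via orthogonal Procrustes. It simulates the attacker with toy numpy code, not the library:

```python
import numpy as np

d = 64
Q, _ = np.linalg.qr(np.random.randn(d, d))  # the secret rotation (unknown to the attacker)

X = np.random.randn(d, d)                   # d leaked original vectors (rows)
Y = X @ Q.T                                 # their rotated counterparts (rotate(x) = Qx)

# Orthogonal Procrustes: the orthogonal M minimizing ||X M - Y||_F is U V^T,
# where U S V^T is the SVD of X^T Y. Here M recovers Q^T exactly.
U, _, Vt = np.linalg.svd(X.T @ Y)
Q_recovered = (U @ Vt).T
print(np.allclose(Q_recovered, Q))          # True: d pairs fully recover the key
```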
## What it is NOT
- Not encryption in the cryptographic sense
- Not differential privacy (no epsilon-delta guarantee)
- Not a substitute for access control on the vector database
Threat model: honest-but-curious vector DB provider who sees only rotated vectors and has no access to your original texts or the rotation key.
## What the server CAN learn
Even with rotation, the server can observe:
- Cluster structure (how many topics exist)
- Document similarity graph (which docs are related)
- Query patterns (which clusters you search most)
- Duplicate/near-duplicate documents
- Temporal patterns (when documents are added)
The server CANNOT determine what any document says, infer PII, or run published inversion attacks.
## Comparison with other approaches
| Property | Rotation (ours) | Differential Privacy | Homomorphic Encryption | IronCore Cloaked AI |
|---|---|---|---|---|
| Search quality | Identical (lossless) | 5-30% recall loss | Identical | ~5% recall loss |
| Latency overhead | <0.1ms per vector | Negligible | 1000-10000x | SDK overhead |
| Deployment | One numpy matmul | Drop-in | Custom server | SDK + license |
| License | Apache 2.0 | N/A | N/A | AGPL / $599+/mo |
| Known-plaintext resistant | No (d pairs breaks it) | Yes | Yes | Partially |
## Key management

Treat `.tqkey` files like SSH private keys:

- Don't commit to git (add `*.tqkey` to .gitignore)
- Back up securely -- if lost, you can't unrotate (search still works)
- Use `from_seed()` with a 128-bit seed to share keys without large files
- Use `rekey_vectors()` to rotate to a new key without exposing originals (see the sketch below)
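
A minimal sketch of the last two workflows, assuming `from_seed()` and `rekey_vectors()` behave as listed under Full API below; `stored_rotated_vectors` is a hypothetical array of vectors already rotated under the old key:

```python
import secrets
from turboquant_vectors import PrivateEncoder

# Share a 128-bit seed instead of a multi-megabyte key file
seed = secrets.randbits(128)
old_encoder = PrivateEncoder.from_seed(dim=1536, seed=seed)

# Move stored vectors to a fresh key without ever materializing the originals
new_encoder = PrivateEncoder.generate(dim=1536)
rekeyed = new_encoder.rekey_vectors(stored_rotated_vectors, old_encoder)
```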
## Benchmarks

| Dimension | Rotate one vector | Rotate 10K batch | Key generation | Key file size |
|---|---|---|---|---|
| 384 | 0.03 ms | 8.7 ms | 31 ms | 0.6 MB |
| 768 | 0.06 ms | 25 ms | 141 ms | 2.4 MB |
| 1536 | 0.11 ms | 88 ms | 465 ms | 9.4 MB |
## Integration examples

Works with any vector DB that accepts float arrays:

```python
import numpy as np
from langchain_core.embeddings import Embeddings

# Pinecone
rotated = encoder.rotate(embeddings)
index.upsert(vectors=[(id, vec.tolist(), meta) for id, vec, meta in zip(ids, rotated, metadata)])

# ChromaDB
collection.add(embeddings=encoder.rotate(embeddings).tolist(), ids=ids)

# LangChain (wrap any embedding model)
class PrivateEmbeddings(Embeddings):
    def __init__(self, base, encoder):
        self.base, self.encoder = base, encoder

    def embed_documents(self, texts):
        return self.encoder.rotate(np.array(self.base.embed_documents(texts))).tolist()

    def embed_query(self, text):
        return self.encoder.rotate(np.array(self.base.embed_query(text))).tolist()

# sentence-transformers
embeddings = model.encode(texts)
rotated = encoder.rotate(embeddings)
```
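
A hypothetical usage of the wrapper above, here with `langchain-openai` (any LangChain embedding model works the same way):

```python
from langchain_openai import OpenAIEmbeddings

# Every vector is rotated before it leaves the process
private_emb = PrivateEmbeddings(OpenAIEmbeddings(model="text-embedding-3-small"), encoder)
doc_vectors = private_emb.embed_documents(["confidential text"])
```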
## Privacy + compression
Combine both: rotate for privacy, then quantize for 8x compression.
```python
compressed = encoder.rotate_and_compress(embeddings, bits=4)
idx, scores = compressed.search(encoder.rotate(query), top_k=10)
compressed.save("private_index.npz")
```
## Compression
8x instant compression, no training needed.
First open-source implementation of Google's TurboQuant (ICLR 2026) for vector search.
```python
from turboquant_vectors import compress, search

compressed = compress(embeddings, bits=4)  # 307 MB -> 38 MB
indices, scores = search(compressed, query, top_k=10)
```
### Why
FAISS Product Quantization requires k-means training per dataset. TurboQuant is instant (data-oblivious), compresses 2-2.5x faster, and gets up to +8pp better recall at the same storage budget.
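
For contrast, a FAISS PQ baseline needs an explicit per-dataset training pass before it can encode anything. A sketch (requires `faiss-cpu`; data and parameters are illustrative):

```python
import numpy as np
import faiss  # pip install faiss-cpu

d = 1536
xb = np.random.randn(10_000, d).astype(np.float32)

# 768 subquantizers x 8 bits = 768 bytes/vector, the same budget as 4-bit above
index = faiss.IndexPQ(d, 768, 8)
index.train(xb)   # the per-dataset k-means pass that TurboQuant skips
index.add(xb)
D, I = index.search(xb[:1], 10)
```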
### Benchmarks (50K vectors, 1536-dim)

| Budget | TurboQuant recall | FAISS PQ recall | Delta | Compress time |
|---|---|---|---|---|
| 2-bit (384 B/vec) | 52.8% | 45.7% | +7.1pp | 3.8s vs 8.5s |
| 4-bit (768 B/vec) | 83.8% | 75.8% | +8.0pp | 6.5s vs 16.0s |
## Install

```bash
pip install turboquant-vectors
```

Requires only numpy -- no torch, no scipy needed for the privacy module.
## Full API

### PrivateEncoder

```python
PrivateEncoder.generate(dim)             # New key from OS entropy
PrivateEncoder.from_seed(dim, seed)      # Deterministic key (128-bit seed)
PrivateEncoder.load_key(path)            # Load from .tqkey file
encoder.rotate(vectors)                  # Apply rotation
encoder.unrotate(vectors)                # Reverse rotation (needs key)
encoder.save_key(path)                   # Save to .tqkey file
encoder.fingerprint()                    # 16-char hex key ID
encoder.rekey_vectors(vecs, old_enc)     # Switch keys without unrotating
encoder.rotate_and_compress(vecs, 4)     # Privacy + compression
encoder.make_canary() / verify_canary()  # Key verification without originals
```
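
A quick round-trip sanity check built only from the calls above (shapes and tolerance are illustrative):

```python
import numpy as np
from turboquant_vectors import PrivateEncoder

enc = PrivateEncoder.generate(dim=1536)
x = np.random.randn(100, 1536).astype(np.float32)

r = enc.rotate(x)
assert np.allclose(enc.unrotate(r), x, atol=1e-5)  # lossless up to float32 precision
print(enc.fingerprint())                           # 16-char hex key ID
```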
### Compression

```python
compress(vectors, bits=4)            # Compress vectors
decompress(compressed)               # Restore to float32
search(compressed, query, top_k=10)  # Search compressed vectors
compressed.save(path) / .load(path)  # Persistence
```
## Paper

**TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate**
Zandieh, Daliri, Hadian, Mirrokni (Google Research)
ICLR 2026 | [arXiv:2504.19874](https://arxiv.org/abs/2504.19874)
Independent implementation, not affiliated with Google Research.
## License
Apache 2.0