# Quiver

A fast, embedded vector database written in Rust with Python bindings.
No server, no network — runs fully in-process with SIMD-accelerated search.
## Installation

```bash
pip install quiver-vector-db
```

Prebuilt wheels are available for macOS (Apple Silicon / Intel) and Linux (x86_64 / ARM64). Requires Python 3.11+.

Optional extras:

```bash
pip install "quiver-vector-db[sentence-transformers]"  # local embeddings
pip install "quiver-vector-db[openai]"                 # OpenAI embeddings
pip install "quiver-vector-db[all]"                    # everything
```

Build from source (requires the Rust toolchain):

```bash
pip install maturin
git clone https://github.com/rhshriva/Quiver.git && cd Quiver
maturin develop --release
```
## Features
| Feature | Description |
|---|---|
| 8 index types | HNSW, Flat, Int8, FP16, IVF, IVF-PQ, Memory-mapped, Binary |
| Built-in embeddings | sentence-transformers (local) and OpenAI API |
| Full-text search | BM25 keyword search with hybrid dense+sparse fusion |
| 3 distance metrics | Cosine, L2, Dot Product — all SIMD-accelerated (AVX2/NEON) |
| Payload filtering | JSON metadata with 9 filter operators ($eq, $ne, $in, $gt, $gte, $lt, $lte, $and, $or) |
| Hybrid search | Weighted fusion of dense vectors + sparse keyword signals |
| Multi-vector | Multiple named embedding spaces per document (text + image) |
| Data versioning | Create, list, restore, and delete collection snapshots |
| Batch upsert | Efficient bulk insertion API |
| Parallel HNSW insert | Multi-threaded insert via rayon micro-batching |
| WAL persistence | Crash-safe writes with automatic compaction |
| IDE support | Full type stubs (py.typed + .pyi) for autocompletion |
| Zero dependencies | Single `pip install`; no required runtime dependencies |
## Performance

Benchmarked on Apple M-series (16 cores) with 2,000 vectors × 128 dimensions, k=10, release build. Insert throughput, single-query search latency, and recall@10 were measured with matched configurations (M=16, ef_construction=200, ef_search=50, nlist=32, nprobe=8) across all four libraries.
### Insert Throughput (vectors/sec, higher is better)
| Index Type | Quiver | faiss | hnswlib | ChromaDB |
|---|---|---|---|---|
| Flat (brute-force) | 28,269K | 56,140K | not-supported | not-supported |
| HNSW (1 thread) | 20,495 | 16,234 | 9,774 | — |
| HNSW (16 threads) | 59,644 | 123,475 | 64,786 | 29,704 |
| Int8 / SQ8 | 12,276K | 6,069K | not-supported | not-supported |
| IVF-Flat | 198K | 1,299K | not-supported | not-supported |
| IVF-PQ | 72K | 133K | not-supported | not-supported |
| FP16 | 13,691K | not-supported | not-supported | not-supported |
| Mmap | 1,748K | not-supported | not-supported | not-supported |
| Binary | 2,217K | not-supported | not-supported | not-supported |
### Search Latency (ms/query, lower is better)
| Index Type | Quiver | faiss | hnswlib | ChromaDB |
|---|---|---|---|---|
| Flat (brute-force) | 0.023 | 0.016 | not-supported | not-supported |
| HNSW | 0.022 | 0.023 | 0.044 | 0.285 |
| Int8 / SQ8 | 0.060 | 0.095 | not-supported | not-supported |
| IVF-Flat | 0.016 | 0.009 | not-supported | not-supported |
| IVF-PQ | 0.014 | 0.008 | not-supported | not-supported |
| FP16 | 0.116 | not-supported | not-supported | not-supported |
| Mmap | 0.032 | not-supported | not-supported | not-supported |
| Binary | 0.013 | not-supported | not-supported | not-supported |
### Recall@10 (higher is better)
| Index Type | Quiver | faiss | hnswlib | ChromaDB |
|---|---|---|---|---|
| Flat (brute-force) | 1.0000 | 1.0000 | not-supported | not-supported |
| HNSW | 0.9410 | 0.9510 | 0.9350 | 0.9280 |
| Int8 / SQ8 | 0.9940 | 0.9930 | not-supported | not-supported |
| IVF-Flat | 0.5740 | 0.5670 | not-supported | not-supported |
| IVF-PQ | 0.0720 | 0.0690 | not-supported | not-supported |
| FP16 | 1.0000 | not-supported | not-supported | not-supported |
| Mmap | 1.0000 | not-supported | not-supported | not-supported |
| Binary | 0.1180 | not-supported | not-supported | not-supported |
Highlights:

- HNSW single-thread insert: Quiver 20.5K — 2.1× faster than hnswlib (9.8K), 1.3× faster than faiss (16.2K)
- Int8 insert: Quiver 12.3M — 2× faster than faiss SQ8 (6.1M)
- HNSW search: Quiver 0.022 ms — 2× faster than hnswlib (0.044 ms), 13× faster than ChromaDB (0.285 ms)
- Int8 search: Quiver 0.060 ms — 1.6× faster than faiss SQ8 (0.095 ms)
- Recall: within ~0.01 of competitors across all shared index types
- 3 index types (FP16, Mmap, Binary) not available in any of the compared libraries

Reproduce with:

```bash
pytest tests/test_competitive_benchmarks.py -v -s
```
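For reference, recall@10 is the fraction of the exact top-10 neighbors that the approximate index returns. A minimal sketch of the metric (illustrative only, not the benchmark harness itself):

```python
def recall_at_k(approx_ids, exact_ids, k=10):
    """Fraction of the exact top-k neighbors found in the approximate top-k."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

# e.g. an ANN index that finds 9 of the 10 true nearest neighbors:
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
approx = [1, 2, 3, 4, 5, 6, 7, 8, 9, 42]
print(recall_at_k(approx, exact))  # 0.9
```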
## Quick Start

### 1. Persistent Collection (WAL-backed)

```python
import quiver_vector_db as quiver

# Open a database (creates the directory if needed)
db = quiver.Client(path="./my_data")
col = db.create_collection("docs", dimensions=384, metric="cosine")

# Insert vectors with metadata
col.upsert(id=1, vector=[0.12, 0.45, ...], payload={"title": "Hello world"})
col.upsert(id=2, vector=[0.98, 0.01, ...], payload={"title": "Vector search"})

# Search
hits = col.search(query=[0.13, 0.44, ...], k=5)
for hit in hits:
    print(hit["id"], hit["distance"], hit["payload"])
```

Reopen the same path later and all collections are restored automatically.
### 2. Text Search with Built-in Embeddings

```python
import quiver_vector_db as quiver
from quiver_vector_db import TextCollection, SentenceTransformerEmbedding

db = quiver.Client(path="./my_data")
col = db.create_collection("articles", dimensions=384, metric="cosine")

# Wrap with automatic embedding + BM25 full-text indexing
text_col = TextCollection(col, SentenceTransformerEmbedding("all-MiniLM-L6-v2"))

# Add documents by text — embedding is handled automatically
text_col.add(ids=[1, 2, 3], documents=[
    "Introduction to machine learning",
    "Advanced deep learning techniques",
    "Cooking recipes from around the world",
])

# Hybrid search (semantic + keyword) — default mode
hits = text_col.query("neural network basics", k=5)

# Semantic only (dense vector similarity)
hits = text_col.query("neural networks", k=5, mode="semantic")

# Keyword only (BM25 full-text)
hits = text_col.query("machine learning", k=5, mode="keyword")

for hit in hits:
    print(hit["id"], hit["document"], hit.get("score") or hit.get("distance"))
```

Requires: `pip install "quiver-vector-db[sentence-transformers]"`
### 3. Filtered Search with Metadata

```python
col.upsert(id=1, vector=[...], payload={"category": "tech", "rating": 4.8})
col.upsert(id=2, vector=[...], payload={"category": "science", "rating": 3.2})
col.upsert(id=3, vector=[...], payload={"category": "tech", "rating": 4.1})

# Simple equality filter
hits = col.search(query=[...], k=5, filter={"category": {"$eq": "tech"}})

# Range filter
hits = col.search(query=[...], k=5, filter={"rating": {"$gte": 4.0}})

# Compound filter with $and / $or
hits = col.search(query=[...], k=5, filter={
    "$and": [
        {"category": {"$in": ["tech", "science"]}},
        {"rating": {"$gte": 4.0}},
    ]
})
```
### 4. Hybrid Dense + Sparse Search

```python
# Upsert with a sparse vector (e.g., BM25 or SPLADE weights)
col.upsert_hybrid(
    id=1, vector=[0.12, 0.45, ...],
    sparse_vector={42: 0.8, 100: 0.5, 3001: 0.3},
    payload={"title": "Rust guide"},
)

# Hybrid search — weighted fusion of dense + sparse signals
hits = col.search_hybrid(
    dense_query=[0.13, 0.44, ...],
    sparse_query={42: 0.7, 100: 0.6},
    k=10,
    dense_weight=0.7,
    sparse_weight=0.3,
)
for hit in hits:
    print(hit["id"], hit["score"], hit["dense_distance"], hit["sparse_score"])
```
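Conceptually, weighted fusion combines a dense similarity with a sparse dot product. A minimal sketch of that idea in plain Python (illustrative only; the exact scoring and normalization inside `search_hybrid` may differ):

```python
def sparse_dot(a: dict, b: dict) -> float:
    """Dot product of two sparse vectors given as {dim: weight} maps."""
    return sum(w * b[d] for d, w in a.items() if d in b)

def cosine(a: list, b: list) -> float:
    """Cosine similarity of two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def fused_score(dense_sim, sparse_score, dense_weight=0.7, sparse_weight=0.3):
    """Weighted combination of the dense and sparse signals."""
    return dense_weight * dense_sim + sparse_weight * sparse_score

doc_sparse = {42: 0.8, 100: 0.5, 3001: 0.3}
query_sparse = {42: 0.7, 100: 0.6}
print(round(sparse_dot(query_sparse, doc_sparse), 2))  # 0.86 = 0.7*0.8 + 0.6*0.5
```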
### 5. Batch Upsert

```python
# Single batch call — more efficient than a loop
col.upsert_batch([
    (1, [0.1, 0.2, ...], {"title": "Doc A"}),
    (2, [0.3, 0.4, ...], {"title": "Doc B"}),
    (3, [0.5, 0.6, ...]),  # payload is optional
])
```
### 6. Multi-Vector / Multi-Modal Search

```python
from quiver_vector_db import MultiVectorCollection

multi = MultiVectorCollection(
    client=db,
    name="products",
    vector_spaces={
        "text": {"dimensions": 384, "metric": "cosine"},
        "image": {"dimensions": 512, "metric": "cosine"},
    },
)

# Upsert with vectors from different modalities
multi.upsert(id=1, vectors={
    "text": [0.1, 0.2, ...],
    "image": [0.5, 0.6, ...],
}, payload={"title": "Blue T-Shirt"})

# Search a single space
hits = multi.search(vector_space="text", query=[0.1, 0.2, ...], k=5)

# Cross-modal fusion search with custom weights
hits = multi.search_multi(
    queries={"text": [0.1, 0.2, ...], "image": [0.5, 0.6, ...]},
    k=5,
    weights={"text": 0.6, "image": 0.4},
)
```
### 7. Data Versioning / Snapshots

```python
# Insert initial data
for i in range(1000):
    col.upsert(id=i, vector=[...], payload={"version": 1})

# Create a snapshot
snapshot = col.create_snapshot("v1")
print(snapshot)  # {"name": "v1", "vector_count": 1000, ...}

# Mutate data...
for i in range(1000, 2000):
    col.upsert(id=i, vector=[...])

# Roll back to v1
col.restore_snapshot("v1")
assert col.count == 1000  # back to the original state

# Manage snapshots
snapshots = col.list_snapshots()
col.delete_snapshot("v1")
```
### 8. Standalone Index (No WAL)

```python
import numpy as np
import quiver_vector_db as quiver

# In-memory HNSW index — no persistence overhead
idx = quiver.HnswIndex(dimensions=128, metric="l2", m=16, ef_construction=200)

# Batch insert from a numpy array (fastest path)
vectors = np.random.randn(10_000, 128).astype(np.float32)
idx.add_batch_np(vectors)
idx.flush()

# Or parallel insert for maximum throughput
idx2 = quiver.HnswIndex(dimensions=128, metric="l2")
idx2.add_batch_parallel(vectors, num_threads=8)
idx2.flush()

# Search
results = idx.search(query=vectors[0].tolist(), k=10)

# Save and load
idx.save("my_index.qvec")
idx_loaded = quiver.HnswIndex.load("my_index.qvec")
```
### 9. REST API Server

```python
# Start the server programmatically
from quiver_vector_db.server import create_server

server = create_server(host="0.0.0.0", port=8080, data_path="./my_data")
server.serve_forever()
```

Or from the command line:

```bash
python -m quiver_vector_db.server --port 8080 --data ./my_data
```
Endpoints:

| Method | Path | Description |
|---|---|---|
| GET | `/healthz` | Health check |
| GET | `/collections` | List collections |
| POST | `/collections` | Create collection (`{"name", "dimensions", "metric"}`) |
| DELETE | `/collections/{name}` | Delete collection |
| POST | `/collections/{name}/upsert` | Upsert vector (`{"id", "vector", "payload"}`) |
| POST | `/collections/{name}/upsert_batch` | Batch upsert (`{"entries": [...]}`) |
| POST | `/collections/{name}/search` | Search vectors (`{"query", "k", "filter"}`) |
| POST | `/collections/{name}/delete` | Delete vector (`{"id"}`) |
| GET | `/collections/{name}/count` | Get vector count |
| POST | `/collections/{name}/snapshots` | Create snapshot |
| GET | `/collections/{name}/snapshots` | List snapshots |
| POST | `/collections/{name}/snapshots/restore` | Restore snapshot |
| DELETE | `/collections/{name}/snapshots/{snap}` | Delete snapshot |
### 10. BM25 Standalone

```python
from quiver_vector_db import BM25

bm25 = BM25(k1=1.5, b=0.75)

# Index documents
bm25.index_document(0, "the quick brown fox jumps over the lazy dog")
bm25.index_document(1, "machine learning with neural networks")
bm25.index_document(2, "the fox and the hound")

# Generate a sparse query vector
sparse_query = bm25.encode_query("quick fox")
print(sparse_query)  # {dim_id: idf_weight, ...}

# Save and load
bm25.save("bm25_state.json")
bm25_loaded = BM25.load("bm25_state.json")
```
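For context on what `k1` and `b` control, the classic BM25 term weight can be sketched in plain Python (a textbook formulation, not necessarily the exact variant Quiver implements):

```python
import math

def bm25_weight(tf, df, n_docs, doc_len, avg_len, k1=1.5, b=0.75):
    """Classic BM25 term weight: idf x tf saturation (k1) x length norm (b)."""
    idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
    norm = k1 * (1 - b + b * doc_len / avg_len)
    return idf * tf * (k1 + 1) / (tf + norm)

# A term appearing twice in an average-length doc, present in 1 of 3 docs:
w = bm25_weight(tf=2, df=1, n_docs=3, doc_len=10, avg_len=10)
print(round(w, 3))
```

Raising `k1` lets repeated terms keep adding score; raising `b` penalizes long documents more aggressively.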
## Index Types

Eight index types, all usable via `Client` or standalone:

| Index | Type | Recall | RAM | Best For |
|---|---|---|---|---|
| `hnsw` | `HnswIndex` | 95–99% | Vectors + graph | General purpose (default) |
| `flat` | `FlatIndex` | 100% | All vectors (f32) | Small datasets, exact results |
| `quantized_flat` | `QuantizedFlatIndex` | ~99% | ~4x less (int8) | Memory-constrained exact search |
| `fp16_flat` | `Fp16FlatIndex` | >99.5% | ~2x less (float16) | Balanced memory vs accuracy |
| `ivf` | `IvfIndex` | Tunable | Vectors + centroids | Large datasets |
| `ivf_pq` | `IvfPqIndex` | ~90%+ | ~96x less (PQ codes) | Million-scale, extreme compression |
| `mmap_flat` | `MmapFlatIndex` | 100% | Near-zero RSS | Datasets larger than RAM |
| `binary_flat` | `BinaryFlatIndex` | Low | ~32x less (1-bit) | Candidate generation, re-ranking |
```python
import quiver_vector_db as quiver

# Via Client (WAL-persisted)
db = quiver.Client(path="./data")
col = db.create_collection("docs", dimensions=768, metric="cosine", index_type="hnsw")

# Standalone in-memory
idx = quiver.FlatIndex(dimensions=384, metric="cosine")
idx = quiver.HnswIndex(dimensions=384, metric="cosine", ef_construction=200, ef_search=50, m=16)
idx = quiver.QuantizedFlatIndex(dimensions=384, metric="cosine")
idx = quiver.Fp16FlatIndex(dimensions=384, metric="cosine")
idx = quiver.IvfIndex(dimensions=384, metric="l2", n_lists=256, nprobe=16, train_size=4096)
idx = quiver.IvfPqIndex(dimensions=384, metric="l2", n_lists=256, nprobe=16, pq_m=8, pq_k_sub=256)
idx = quiver.MmapFlatIndex(dimensions=384, metric="cosine", path="./vectors.qvec")
idx = quiver.BinaryFlatIndex(dimensions=384, metric="l2")
```
## Distance Metrics

| Metric | String | Use When |
|---|---|---|
| Cosine | `"cosine"` | Text/image embeddings (most common) |
| L2 (Euclidean) | `"l2"` | Geometry, sensor data |
| Dot Product | `"dot_product"` | Pre-normalized vectors |

All metrics use SIMD-accelerated kernels (AVX2+FMA on x86, NEON on ARM).
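The three metrics are closely related for unit-length vectors: cosine similarity equals the dot product, and squared L2 distance equals 2·(1 − cosine). That is why dot product is listed for pre-normalized vectors. A quick pure-Python check of these identities:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

# Two unit-length vectors
a = normalize([3.0, 4.0])
b = normalize([1.0, 2.0])

cos_sim = dot(a, b)                              # cosine == dot for unit vectors
l2_sq = sum((x - y) ** 2 for x, y in zip(a, b))  # squared L2 distance

# For unit vectors: ||a - b||^2 == 2 * (1 - cosine)
assert abs(l2_sq - 2 * (1 - cos_sim)) < 1e-12
```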
## IDE Support

Quiver ships with `py.typed` and `.pyi` type stubs. Autocompletion, type checking, and inline docs work out of the box in VSCode, PyCharm, and any editor supporting PEP 561.
## Development

### Prerequisites

- Rust toolchain (`rustup`, `cargo`)
- Python 3.11+

### Setup

```bash
git clone https://github.com/rhshriva/Quiver.git && cd Quiver
python3 -m venv .venv && source .venv/bin/activate
pip install maturin pytest numpy
maturin develop --release
```
### Running Tests

```bash
# Rust tests (~190 tests)
cargo test --workspace

# Python functional tests (~170 tests)
pytest tests/ -v --ignore=tests/test_perf.py --ignore=tests/test_benchmark.py --ignore=tests/test_perf_regression.py

# Performance benchmarks with apples-to-apples comparisons
pytest tests/test_perf_regression.py -v -s
```
## License

Apache 2.0 — see LICENSE for details.