Fast embedded vector database with Python bindings

Quiver

A fast, embedded vector database written in Rust with Python bindings.
No server, no network — runs fully in-process with SIMD-accelerated search.

Installation

pip install quiver-vector-db

Prebuilt wheels are available for macOS (Apple Silicon / Intel) and Linux (x86_64 / ARM64). Requires Python 3.11+.

Optional extras:

# Quote the extras so shells such as zsh do not treat the brackets as a glob
pip install "quiver-vector-db[sentence-transformers]"   # local embeddings
pip install "quiver-vector-db[openai]"                   # OpenAI embeddings
pip install "quiver-vector-db[all]"                      # everything

Build from source (requires Rust toolchain):

pip install maturin
git clone https://github.com/rhshriva/Quiver.git && cd Quiver
maturin develop --release

Features

Feature	Description
8 index types	HNSW, Flat, Int8, FP16, IVF, IVF-PQ, Memory-mapped, Binary
Built-in embeddings	sentence-transformers (local) and OpenAI API
Full-text search	BM25 keyword search with hybrid dense+sparse fusion
3 distance metrics	Cosine, L2, Dot Product — all SIMD-accelerated (AVX2/NEON)
Payload filtering	JSON metadata with 9 operators (`$eq`, `$ne`, `$in`, `$gt`, `$gte`, `$lt`, `$lte`, `$and`, `$or`)
Hybrid search	Weighted fusion of dense vectors + sparse keyword signals
Multi-vector	Multiple named embedding spaces per document (text + image)
Data versioning	Create, list, restore, and delete collection snapshots
Batch upsert	Efficient bulk insertion API
Parallel HNSW insert	Multi-threaded insert via rayon micro-batching
WAL persistence	Crash-safe writes with automatic compaction
REST API server	Zero-dependency HTTP server for language-agnostic access
IDE support	Full type stubs (`py.typed` + `.pyi`) for autocompletion
Zero dependencies	Single `pip install`, no servers or runtimes

Performance

Benchmarked on Apple M-series (16 cores), 2,000 vectors, 128 dimensions, k=10, release build. All comparisons use identical configurations (M=16, ef_construction=200, ef_search=50, nlist=32, nprobe=8).

Insert Throughput, Search Latency & Recall — All Index Types

Index	Insert (vec/s)	Search (ms)	Recall@10
Quiver Flat	1,440K	0.027	1.0000
faiss Flat	59,480K	0.016	1.0000

Quiver HNSW (1T)	21,163	0.020	0.9440
Quiver HNSW (16T)	61,558	0.021	0.9410
hnswlib (1T)	10,036	0.045	0.9350
hnswlib (16T)	67,149	0.045	0.9350
faiss HNSW (1T)	16,122	0.024	0.9510
faiss HNSW (16T)	117,512	0.024	0.9510

Quiver Int8	1,294K	0.060	0.9940
faiss SQ8	5,795K	0.094	0.9930

Quiver IVF	174K	0.015	0.5390
faiss IVFFlat	1,287K	0.009	0.5670

Quiver IVF-PQ	60K	0.015	0.0790
faiss IVFPQ	134K	0.008	0.0690

Quiver FP16	1,403K	0.115	1.0000
Quiver Mmap	816K	0.032	1.0000
Quiver Binary	873K	0.013	0.1190
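
Recall@10 in the table above is the fraction of the true 10 nearest neighbours that an index returns. The sketch below shows one way to measure it, assuming ground truth comes from a brute-force L2 scan. The helper names are illustrative and are not part of Quiver or its benchmark suite.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_top_k(vectors, query, k):
    """Ground truth: ids of the k closest vectors by brute-force scan."""
    order = sorted(range(len(vectors)), key=lambda i: l2(vectors[i], query))
    return order[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


random.seed(0)
vectors = [[random.gauss(0, 1) for _ in range(16)] for _ in range(200)]
query = [random.gauss(0, 1) for _ in range(16)]

truth = exact_top_k(vectors, query, k=10)
# Simulate an approximate index that misses the two farthest true neighbours.
approx = truth[:8] + [i for i in range(200) if i not in truth][:2]
print(recall_at_k(approx, truth))  # 0.8
```

In a real measurement, `approx` would be the ids returned by the index under test, averaged over many queries.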

Highlights:

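HNSW search: Quiver has the lowest latency at 0.020ms, vs hnswlib 0.045ms (2.3x slower) and faiss 0.024ms (1.2x slower)
HNSW insert (1T): Quiver 21K vec/s vs hnswlib 10K (2.1x faster) and faiss 16K (1.3x faster)
HNSW insert (16T): Quiver 62K vec/s; parallel micro-batching gives a 2.9x speedup over single-threaded insert, though hnswlib (67K) and faiss (118K) are faster at 16 threads
Int8 search: Quiver 0.060ms vs faiss SQ8 0.094ms (1.6x faster)
Recall: Quiver is within 0.03 of the other libraries on every shared index type: equal on Flat, higher than faiss on Int8 and IVF-PQ, higher than hnswlib but slightly lower than faiss on HNSW, and lower than faiss on IVF (0.539 vs 0.567)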
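
The ratios quoted in these highlights can be re-derived from the table. The short check below uses only numbers copied from the HNSW rows; for latency the ratio is competitor over Quiver, and for throughput it is Quiver over competitor.

```python
# Values copied from the HNSW rows of the benchmark table.
search_ms = {"quiver": 0.020, "hnswlib": 0.045, "faiss": 0.024}
insert_1t = {"quiver": 21_163, "hnswlib": 10_036, "faiss": 16_122}
quiver_insert_16t = 61_558

for lib in ("hnswlib", "faiss"):
    # Lower latency is better, higher throughput is better.
    search_ratio = search_ms[lib] / search_ms["quiver"]
    insert_ratio = insert_1t["quiver"] / insert_1t[lib]
    print(f"{lib}: search {search_ratio:.2f}x, insert {insert_ratio:.2f}x")

parallel_speedup = quiver_insert_16t / insert_1t["quiver"]
print(f"16T vs 1T insert: {parallel_speedup:.2f}x")
```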

Reproduce: pytest tests/test_perf_regression.py::TestAppleToAppleComparison -v -s

Quick Start

1. Persistent Collection (WAL-backed)

import quiver_vector_db as quiver

# Open a database (creates directory if needed)
db  = quiver.Client(path="./my_data")
col = db.create_collection("docs", dimensions=384, metric="cosine")

# Insert vectors with metadata
col.upsert(id=1, vector=[0.12, 0.45, ...], payload={"title": "Hello world"})
col.upsert(id=2, vector=[0.98, 0.01, ...], payload={"title": "Vector search"})

# Search
hits = col.search(query=[0.13, 0.44, ...], k=5)
for hit in hits:
    print(hit["id"], hit["distance"], hit["payload"])

Reopen the same path later and all collections are restored automatically.
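
For readers new to write-ahead logging, the general idea is that each write is appended to a log before it is applied, and the log is replayed when the store is reopened. The sketch below illustrates that idea with a JSON-lines file. It is a toy under stated assumptions: the `TinyWal` class, the file name, and the record format are hypothetical, and nothing here describes Quiver's actual on-disk format or compaction.

```python
import json
import os
import tempfile


class TinyWal:
    """Toy write-ahead log: append each operation, replay all on open."""

    def __init__(self, path):
        self.path = path
        self.state = {}  # id -> {"vector": [...], "payload": {...}}
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                for line in f:
                    self._apply(json.loads(line))

    def _apply(self, record):
        if record["op"] == "upsert":
            self.state[record["id"]] = {
                "vector": record["vector"],
                "payload": record["payload"],
            }
        elif record["op"] == "delete":
            self.state.pop(record["id"], None)

    def _append(self, record):
        # Persist the record first, then mutate the in-memory state.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self._apply(record)

    def upsert(self, id, vector, payload=None):
        self._append({"op": "upsert", "id": id, "vector": vector, "payload": payload})

    def delete(self, id):
        self._append({"op": "delete", "id": id})


wal_path = os.path.join(tempfile.mkdtemp(), "toy.wal")
log = TinyWal(wal_path)
log.upsert(1, [0.1, 0.2], {"title": "Hello world"})
log.upsert(2, [0.3, 0.4])
log.delete(2)

reopened = TinyWal(wal_path)  # replays the log from disk
print(sorted(reopened.state))  # [1]
```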

2. Text Search with Built-in Embeddings

import quiver_vector_db as quiver
from quiver_vector_db import TextCollection, SentenceTransformerEmbedding

db  = quiver.Client(path="./my_data")
col = db.create_collection("articles", dimensions=384, metric="cosine")

# Wrap with automatic embedding + BM25 full-text indexing
text_col = TextCollection(col, SentenceTransformerEmbedding("all-MiniLM-L6-v2"))

# Add documents by text — embedding is handled automatically
text_col.add(ids=[1, 2, 3], documents=[
    "Introduction to machine learning",
    "Advanced deep learning techniques",
    "Cooking recipes from around the world",
])

# Hybrid search (semantic + keyword) — default mode
hits = text_col.query("neural network basics", k=5)

# Semantic only (dense vector similarity)
hits = text_col.query("neural networks", k=5, mode="semantic")

# Keyword only (BM25 full-text)
hits = text_col.query("machine learning", k=5, mode="keyword")

for hit in hits:
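    # Explicit None check so a legitimate score of 0.0 is not skipped
    score = hit.get("score")
    print(hit["id"], hit["document"], score if score is not None else hit.get("distance"))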

Requires: pip install "quiver-vector-db[sentence-transformers]"

3. Filtered Search with Metadata

col.upsert(id=1, vector=[...], payload={"category": "tech", "rating": 4.8})
col.upsert(id=2, vector=[...], payload={"category": "science", "rating": 3.2})
col.upsert(id=3, vector=[...], payload={"category": "tech", "rating": 4.1})

# Simple equality filter
hits = col.search(query=[...], k=5, filter={"category": {"$eq": "tech"}})

# Range filter
hits = col.search(query=[...], k=5, filter={"rating": {"$gte": 4.0}})

# Compound filter with $and / $or
hits = col.search(query=[...], k=5, filter={
    "$and": [
        {"category": {"$in": ["tech", "science"]}},
        {"rating": {"$gte": 4.0}},
    ]
})
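
To make the filter semantics concrete, the sketch below evaluates the same filter shapes against plain dicts. The operator names match the syntax above; their behaviour is assumed to mirror the MongoDB-style operators they resemble. The `matches` helper is illustrative and is not Quiver code.

```python
import operator

# Comparison operators, assumed to behave like their MongoDB namesakes.
_COMPARE = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
}


def matches(payload, flt):
    """Return True if the payload dict satisfies the filter dict."""
    for key, cond in flt.items():
        if key == "$and":
            if not all(matches(payload, sub) for sub in cond):
                return False
        elif key == "$or":
            if not any(matches(payload, sub) for sub in cond):
                return False
        else:
            if key not in payload:
                return False  # assumption: a missing field never matches
            for op, operand in cond.items():
                if not _COMPARE[op](payload[key], operand):
                    return False
    return True


docs = {
    1: {"category": "tech", "rating": 4.8},
    2: {"category": "science", "rating": 3.2},
    3: {"category": "tech", "rating": 4.1},
}
flt = {
    "$and": [
        {"category": {"$in": ["tech", "science"]}},
        {"rating": {"$gte": 4.0}},
    ]
}
print([doc_id for doc_id, p in docs.items() if matches(p, flt)])  # [1, 3]
```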

4. Hybrid Dense + Sparse Search

# Upsert with sparse vector (e.g., BM25 or SPLADE weights)
col.upsert_hybrid(
    id=1, vector=[0.12, 0.45, ...],
    sparse_vector={42: 0.8, 100: 0.5, 3001: 0.3},
    payload={"title": "Rust guide"},
)

# Hybrid search — weighted fusion of dense + sparse signals
hits = col.search_hybrid(
    dense_query=[0.13, 0.44, ...],
    sparse_query={42: 0.7, 100: 0.6},
    k=10,
    dense_weight=0.7,
    sparse_weight=0.3,
)

for hit in hits:
    print(hit["id"], hit["score"], hit["dense_distance"], hit["sparse_score"])
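
How the dense and sparse signals are combined is not spelled out above. One common approach, shown here purely as an assumption, is a weighted sum of cosine similarity and a sparse dot product. The function names are illustrative, and Quiver may normalise or combine scores differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two dense vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sparse_dot(a, b):
    """Dot product of two {dimension_id: weight} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(w * b[d] for d, w in a.items() if d in b)


def fused_score(dense_q, sparse_q, dense_v, sparse_v,
                dense_weight=0.7, sparse_weight=0.3):
    """Weighted sum of the dense and sparse signals (assumed fusion rule)."""
    return (dense_weight * cosine_similarity(dense_q, dense_v)
            + sparse_weight * sparse_dot(sparse_q, sparse_v))


docs = {
    1: ([1.0, 0.0], {42: 0.8, 100: 0.5}),
    2: ([0.0, 1.0], {42: 0.1}),
}
dense_q, sparse_q = [1.0, 0.0], {42: 0.7, 100: 0.6}

ranked = sorted(docs, key=lambda i: fused_score(dense_q, sparse_q, *docs[i]), reverse=True)
print(ranked)  # [1, 2]
```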

5. Batch Upsert

# Single batch call — more efficient than a loop
col.upsert_batch([
    (1, [0.1, 0.2, ...], {"title": "Doc A"}),
    (2, [0.3, 0.4, ...], {"title": "Doc B"}),
    (3, [0.5, 0.6, ...]),  # payload is optional
])
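
When the source data is a large iterable, it can help to feed `upsert_batch` in fixed-size chunks rather than one giant list. The `chunked` helper below is a hypothetical convenience, not part of Quiver, and the chunk size is arbitrary.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


ids = range(5)
vectors = [[float(i), float(i) + 0.5] for i in ids]

# zip() produces (id, vector) tuples, matching the payload-less entry form above.
batches = list(chunked(zip(ids, vectors), size=2))
print([len(b) for b in batches])  # [2, 2, 1]
```

Each batch could then be passed along with something like `col.upsert_batch(batch)`.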

6. Multi-Vector / Multi-Modal Search

from quiver_vector_db import MultiVectorCollection

multi = MultiVectorCollection(
    client=db,
    name="products",
    vector_spaces={
        "text":  {"dimensions": 384, "metric": "cosine"},
        "image": {"dimensions": 512, "metric": "cosine"},
    },
)

# Upsert with vectors from different modalities
multi.upsert(id=1, vectors={
    "text":  [0.1, 0.2, ...],
    "image": [0.5, 0.6, ...],
}, payload={"title": "Blue T-Shirt"})

# Search a single space
hits = multi.search(vector_space="text", query=[0.1, 0.2, ...], k=5)

# Cross-modal fusion search with custom weights
hits = multi.search_multi(
    queries={"text": [0.1, 0.2, ...], "image": [0.5, 0.6, ...]},
    k=5,
    weights={"text": 0.6, "image": 0.4},
)

7. Data Versioning / Snapshots

# Insert initial data
for i in range(1000):
    col.upsert(id=i, vector=[...], payload={"version": 1})

# Create a snapshot
snapshot = col.create_snapshot("v1")
print(snapshot)  # {"name": "v1", "vector_count": 1000, ...}

# Mutate data...
for i in range(1000, 2000):
    col.upsert(id=i, vector=[...])

# Roll back to v1
col.restore_snapshot("v1")
assert col.count == 1000  # back to original state

# Manage snapshots
snapshots = col.list_snapshots()
col.delete_snapshot("v1")

8. Standalone Index (No WAL)

import numpy as np
import quiver_vector_db as quiver

# In-memory HNSW index — no persistence overhead
idx = quiver.HnswIndex(dimensions=128, metric="l2", m=16, ef_construction=200)

# Batch insert from numpy array (fastest path)
vectors = np.random.randn(10_000, 128).astype(np.float32)
idx.add_batch_np(vectors)
idx.flush()

# Or parallel insert for maximum throughput
idx2 = quiver.HnswIndex(dimensions=128, metric="l2")
idx2.add_batch_parallel(vectors, num_threads=8)
idx2.flush()

# Search
results = idx.search(query=vectors[0].tolist(), k=10)

# Save and load
idx.save("my_index.qvec")
idx_loaded = quiver.HnswIndex.load("my_index.qvec")

9. REST API Server

# Start server programmatically
from quiver_vector_db.server import create_server

# 0.0.0.0 listens on all network interfaces; use 127.0.0.1 for local-only access
server = create_server(host="0.0.0.0", port=8080, data_path="./my_data")
server.serve_forever()

Or from the command line:

python -m quiver_vector_db.server --port 8080 --data ./my_data

Endpoints:

Method	Path	Description
`GET`	`/healthz`	Health check
`GET`	`/collections`	List collections
`POST`	`/collections`	Create collection (`{"name", "dimensions", "metric"}`)
`DELETE`	`/collections/{name}`	Delete collection
`POST`	`/collections/{name}/upsert`	Upsert vector (`{"id", "vector", "payload"}`)
`POST`	`/collections/{name}/upsert_batch`	Batch upsert (`{"entries": [...]}`)
`POST`	`/collections/{name}/search`	Search vectors (`{"query", "k", "filter"}`)
`POST`	`/collections/{name}/delete`	Delete vector (`{"id"}`)
`GET`	`/collections/{name}/count`	Get vector count
`POST`	`/collections/{name}/snapshots`	Create snapshot
`GET`	`/collections/{name}/snapshots`	List snapshots
`POST`	`/collections/{name}/snapshots/restore`	Restore snapshot
`DELETE`	`/collections/{name}/snapshots/{snap}`	Delete snapshot
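
A client for these endpoints needs nothing beyond an HTTP library. The sketch below builds requests with Python's standard `urllib`. The paths and body keys come from the table above; the base URL, the `make_request` and `send` helpers, and the assumption that responses are JSON are mine.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumed: server started with --port 8080


def make_request(method, path, body=None):
    """Build (but do not send) a JSON request for the given endpoint."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def send(req):
    """Send a request and decode the response, assuming a JSON body."""
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


create = make_request("POST", "/collections",
                      {"name": "docs", "dimensions": 3, "metric": "cosine"})
upsert = make_request("POST", "/collections/docs/upsert",
                      {"id": 1, "vector": [0.1, 0.2, 0.3], "payload": {"title": "Hello"}})
search = make_request("POST", "/collections/docs/search",
                      {"query": [0.1, 0.2, 0.3], "k": 5})

print(search.get_method(), search.full_url)
# With a server running, each request can be sent in order:
#   for req in (create, upsert, search):
#       print(send(req))
```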

10. BM25 Standalone

from quiver_vector_db import BM25

bm25 = BM25(k1=1.5, b=0.75)

# Index documents
bm25.index_document(0, "the quick brown fox jumps over the lazy dog")
bm25.index_document(1, "machine learning with neural networks")
bm25.index_document(2, "the fox and the hound")

# Generate sparse query vector
sparse_query = bm25.encode_query("quick fox")
print(sparse_query)  # {dim_id: idf_weight, ...}

# Save and load
bm25.save("bm25_state.json")
bm25_loaded = BM25.load("bm25_state.json")
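
For readers unfamiliar with the `k1` and `b` parameters, the sketch below scores the same three sentences with the standard Okapi BM25 formula. It is a reference illustration only: whitespace tokenisation and the IDF variant are assumptions, and Quiver's `BM25` class may tokenise and weight terms differently.

```python
import math
from collections import Counter

docs = {
    0: "the quick brown fox jumps over the lazy dog",
    1: "machine learning with neural networks",
    2: "the fox and the hound",
}
k1, b = 1.5, 0.75  # k1 controls term-frequency saturation, b length normalisation

tokens = {doc_id: text.split() for doc_id, text in docs.items()}
avg_len = sum(len(t) for t in tokens.values()) / len(tokens)
# Number of documents containing each term.
doc_freq = Counter(term for t in tokens.values() for term in set(t))


def idf(term):
    """Smoothed inverse document frequency (always non-negative)."""
    n = doc_freq.get(term, 0)
    return math.log((len(docs) - n + 0.5) / (n + 0.5) + 1.0)


def bm25_score(query, doc_id):
    """Okapi BM25 score of one document for a whitespace-tokenised query."""
    tf = Counter(tokens[doc_id])
    length_norm = 1 - b + b * len(tokens[doc_id]) / avg_len
    score = 0.0
    for term in query.split():
        f = tf.get(term, 0)
        score += idf(term) * f * (k1 + 1) / (f + k1 * length_norm)
    return score


ranked = sorted(docs, key=lambda d: bm25_score("quick fox", d), reverse=True)
print(ranked)  # [0, 2, 1]
```

Document 0 ranks first because it contains both query terms, and the rarer term ("quick") carries the larger IDF weight.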

Index Types

Eight index types, all usable via Client or standalone:

Index	Type	Recall	RAM	Best For
`hnsw`	`HnswIndex`	95-99%	Vectors + graph	General purpose (default)
`flat`	`FlatIndex`	100%	All vectors (f32)	Small datasets, exact results
`quantized_flat`	`QuantizedFlatIndex`	~99%	~4x less (int8)	Memory-constrained near-exact search
`fp16_flat`	`Fp16FlatIndex`	>99.5%	~2x less (float16)	Balanced memory vs accuracy
`ivf`	`IvfIndex`	Tunable	Vectors + centroids	Large datasets
`ivf_pq`	`IvfPqIndex`	~90%+	~96x less (PQ codes)	Million-scale, extreme compression
`mmap_flat`	`MmapFlatIndex`	100%	Near-zero RSS	Dataset larger than RAM
`binary_flat`	`BinaryFlatIndex`	Low	~32x less (1-bit)	Candidate generation, re-ranking

import quiver_vector_db as quiver

# Via Client (WAL-persisted)
db = quiver.Client(path="./data")
col = db.create_collection("docs", dimensions=768, metric="cosine", index_type="hnsw")

# Standalone in-memory
idx = quiver.FlatIndex(dimensions=384, metric="cosine")
idx = quiver.HnswIndex(dimensions=384, metric="cosine", ef_construction=200, ef_search=50, m=16)
idx = quiver.QuantizedFlatIndex(dimensions=384, metric="cosine")
idx = quiver.Fp16FlatIndex(dimensions=384, metric="cosine")
idx = quiver.IvfIndex(dimensions=384, metric="l2", n_lists=256, nprobe=16, train_size=4096)
idx = quiver.IvfPqIndex(dimensions=384, metric="l2", n_lists=256, nprobe=16, pq_m=8, pq_k_sub=256)
idx = quiver.MmapFlatIndex(dimensions=384, metric="cosine", path="./vectors.qvec")
idx = quiver.BinaryFlatIndex(dimensions=384, metric="l2")
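
The RAM column in the table can be sanity-checked with arithmetic on bytes per vector. The sketch below does so for 384 dimensions. It counts raw vector storage only and ignores graph links, centroids, codebooks, and per-entry overhead. The PQ rows assume one byte per sub-vector code (which holds when `pq_k_sub=256`); the `pq_m=16` row is a hypothetical configuration included to show how the compression ratio depends on `pq_m`.

```python
DIMS = 384
F32_BYTES = DIMS * 4  # 1536 bytes per uncompressed float32 vector

bytes_per_vector = {
    "flat (f32)": F32_BYTES,
    "fp16_flat (float16)": DIMS * 2,
    "quantized_flat (int8)": DIMS * 1,
    "binary_flat (1 bit per dim)": DIMS // 8,
    "ivf_pq (pq_m=16)": 16,  # assumed: 1 byte per sub-vector code
    "ivf_pq (pq_m=8)": 8,    # assumed: 1 byte per sub-vector code
}

ratios = {name: F32_BYTES / size for name, size in bytes_per_vector.items()}
for name, size in bytes_per_vector.items():
    print(f"{name:28s} {size:5d} B  {ratios[name]:6.1f}x smaller than f32")
```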

Distance Metrics

Metric	String	Use When
Cosine	`"cosine"`	Text/image embeddings (most common)
L2 (Euclidean)	`"l2"`	Geometry, sensor data
Dot Product	`"dot_product"`	Pre-normalized vectors

All metrics use SIMD-accelerated kernels (AVX2+FMA on x86, NEON on ARM).
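
The three metrics are closely related: for unit-length vectors, dot product equals cosine similarity, and squared L2 distance equals 2 - 2 * cosine. The plain-Python sketch below demonstrates this. These are reference implementations for illustration, not Quiver's SIMD kernels.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2(a, b):
    """Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


a = normalize([3.0, 4.0])  # [0.6, 0.8]
b = normalize([4.0, 3.0])  # [0.8, 0.6]

print(round(cosine_similarity(a, b), 4))                       # 0.96
print(round(dot(a, b), 4))                                     # 0.96
print(round(l2(a, b) ** 2, 4), round(2 - 2 * dot(a, b), 4))    # 0.08 0.08
```

This is why `"dot_product"` is a reasonable choice for pre-normalized vectors: it yields the same ranking as cosine without the normalization step.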

API Reference

`Client(path="./data")`

Persistent vector database client. Opens a directory on disk; all writes are WAL-backed.

Method	Description
`create_collection(name, dimensions, metric, index_type)`	Create a new collection
`get_collection(name)`	Get an existing collection
`get_or_create_collection(name, dimensions, metric)`	Get or create
`delete_collection(name)`	Delete collection and data
`list_collections()`	List collection names

`Collection`

Method	Description
`upsert(id, vector, payload=None)`	Insert or update a vector
`upsert_batch(entries)`	Batch insert `[(id, vector, payload?)]`
`search(query, k, filter=None)`	Search k nearest. Returns `[{"id", "distance", "payload"}]`
`upsert_hybrid(id, vector, sparse_vector, payload)`	Upsert with sparse vector
`search_hybrid(dense_query, sparse_query, k, ...)`	Weighted dense+sparse search
`delete(id)`	Delete by ID
`create_snapshot(name)`	Snapshot current state
`restore_snapshot(name)`	Restore to snapshot
`list_snapshots()`	List all snapshots
`delete_snapshot(name)`	Delete a snapshot
`count`	Number of vectors

`TextCollection(collection, embedding_function)`

Method	Description
`add(ids, documents, payloads=None)`	Embed and index documents
`query(text, k, mode="hybrid")`	Search by text. Modes: `"hybrid"`, `"semantic"`, `"keyword"`
`delete(ids)`	Delete documents

Embedding Functions

Class	Provider	Install
`SentenceTransformerEmbedding(model)`	Local models	`pip install "quiver-vector-db[sentence-transformers]"`
`OpenAIEmbedding(model, api_key)`	OpenAI API	`pip install "quiver-vector-db[openai]"`

Custom embedder:

class MyEmbedder:
    def __call__(self, texts: list[str]) -> list[list[float]]:
        return [my_model.encode(t) for t in texts]

    @property
    def dimensions(self) -> int:
        return 384
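
The example above leaves `my_model` undefined. For a fully self-contained stand-in, the sketch below implements the same two-member interface (`__call__` and `dimensions`) using a deterministic hashing trick. `HashingEmbedder` is a hypothetical class with no semantic quality; it is useful only for wiring tests without downloading a model.

```python
import hashlib
import math


class HashingEmbedder:
    """Deterministic toy embedder: hashes each token into a signed bucket."""

    def __init__(self, dimensions=384):
        self._dimensions = dimensions

    @property
    def dimensions(self) -> int:
        return self._dimensions

    def __call__(self, texts: list[str]) -> list[list[float]]:
        return [self._embed(t) for t in texts]

    def _embed(self, text: str) -> list[float]:
        vec = [0.0] * self._dimensions
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            bucket = int.from_bytes(digest[:4], "big") % self._dimensions
            sign = 1.0 if digest[4] % 2 == 0 else -1.0
            vec[bucket] += sign
        # Normalise to unit length unless every token cancelled out.
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec] if norm else vec


embedder = HashingEmbedder(dimensions=16)
vectors = embedder(["hello world", "hello world", "something else"])
print(len(vectors), len(vectors[0]), vectors[0] == vectors[1])  # 3 16 True
```

Going by the `TextCollection(collection, embedding_function)` signature, an instance would presumably be passed in place of `SentenceTransformerEmbedding`, with `dimensions` matching the collection.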

IDE Support

Quiver ships with py.typed and .pyi type stubs. Autocompletion, type checking, and inline docs work out of the box in VS Code, PyCharm, and any editor supporting PEP 561.

Development

Prerequisites

Rust toolchain (rustup, cargo)
Python 3.11+

Setup

git clone https://github.com/rhshriva/Quiver.git && cd Quiver
python3 -m venv .venv && source .venv/bin/activate
pip install maturin pytest numpy
maturin develop --release

Running Tests

# Rust tests (~190 tests)
cargo test --workspace

# Python functional tests (~170 tests)
pytest tests/ -v --ignore=tests/test_perf.py --ignore=tests/test_benchmark.py --ignore=tests/test_perf_regression.py

# Performance benchmarks with apples-to-apples comparisons
pytest tests/test_perf_regression.py -v -s

License

Apache 2.0 — see LICENSE for details.

Release history

0.1.4 (Mar 15, 2026)
0.1.3 (Mar 15, 2026)
0.1.2 (Mar 14, 2026)
0.1.1 (Mar 14, 2026), this version
0.1.0 (Mar 12, 2026)

