A lightweight, high-performance vector database with a Rust backend. Hybrid search, embedded + server modes, zero dependencies.

vxdb

The vector database that fits in your pocket.

Rust-powered. Python-native. One pip install away.

pip install vxdb

import vxdb

db = vxdb.Database(path="./my_data")  # persistent — data survives restarts
collection = db.create_collection("docs", dimension=384)

embed = your_embedding_function  # OpenAI, Sentence Transformers, Cohere, etc.

collection.upsert(
    ids=["a", "b"],
    vectors=[embed("how to train a model"), embed("best pasta recipe")],
    documents=["how to train a model", "best pasta recipe"],
)

collection.query(vector=embed("machine learning"), top_k=5)

embed() is any function that turns text into vectors — see examples/ for OpenAI, Sentence Transformers, LangChain, and Cohere.
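To try the snippet above without an API key or a model download, a stand-in embedding function is enough. The sketch below is a hypothetical helper (`toy_embed` is not part of vxdb): it hashes tokens into 384 buckets, so it reflects word overlap only and is no substitute for a real embedding model.

```python
import hashlib
import math


def toy_embed(text: str, dimension: int = 384) -> list[float]:
    """Hash each lowercase token into one of `dimension` buckets, then L2-normalize.

    Hypothetical helper for smoke tests only: it captures word overlap, not meaning.
    """
    vector = [0.0] * dimension
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dimension
        vector[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return vector  # empty input: return the zero vector unchanged
    return [x / norm for x in vector]
```

Setting `embed = toy_embed` should be enough to exercise the quick start end to end, assuming the collection was created with `dimension=384`.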

That's it. No Docker. No config files. No cloud account. No 500 MB of dependencies.

Why developers choose vxdb

Stupid fast

The entire hot path — distance computation, HNSW traversal, BM25 scoring, mmap I/O — is pure Rust with zero GIL contention. Your Python code calls directly into compiled native code via PyO3. No serialization overhead. No REST round-trips. No subprocess.

Stupid light

A single native wheel under 5 MB with zero Python dependencies (the published 0.1.0 wheels are 1.3 to 1.6 MB). Starts in under 10 ms. No numpy. No scipy. No protobuf. No grpcio version conflicts. Just pip install vxdb and you're done.

Runs anywhere

Laptop. CI pipeline. Raspberry Pi. AWS Lambda. Docker container. Air-gapped server. Anywhere Python runs, vxdb runs. No infrastructure required to get started — scale up to a standalone server when you need it.

Hybrid search built-in

Vector similarity + BM25 keyword matching fused via Reciprocal Rank Fusion. One API call. Tunable alpha parameter. No separate search engine needed. No Elasticsearch sidecar.

Other databases like Qdrant, Milvus, and Zvec support hybrid search too — but they require you to run a separate sparse encoder (BM25 or SPLADE) yourself and pass pre-computed sparse vectors. vxdb computes BM25 internally from the documents you already upserted. One call: hybrid_query(vector=..., query="text", alpha=0.5). No extra step.
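For readers unfamiliar with BM25, here is a rough pure-Python sketch of the textbook Okapi BM25 formula. It is not vxdb's implementation: the whitespace tokenizer and the parameters `k1=1.5` and `b=0.75` are assumptions picked for illustration, and `bm25_scores` is a hypothetical name.

```python
import math
from collections import Counter


def bm25_scores(query: str, documents: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with textbook Okapi BM25."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs if n_docs else 0.0
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue  # the term does not occur in this document
            df = doc_freq[term]
            idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
            # Longer-than-average documents are penalized through `b`.
            length_norm = 1.0 - b + b * len(toks) / avg_len
            score += idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)
        scores.append(score)
    return scores
```

Rare terms get a larger `idf`, which is why exact identifiers such as product names or error codes tend to dominate keyword scores.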

Dual-mode: embedded + server

Many databases now offer an "embedded" mode — but the implementations vary widely. Qdrant's local mode is a Python reimplementation (not their Rust engine). Weaviate embedded downloads a Go binary and runs it as a subprocess. Milvus Lite works but is limited to Linux/macOS and recommended for <1M vectors.

vxdb's embedded mode is the real Rust engine compiled directly into a Python extension via PyO3. No serialization. No subprocess. No network. And the same engine powers the standalone REST server — start in a notebook, scale to multi-client HTTP when you're ready. No rewrite.
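One way to take advantage of the shared engine is to write application code against the collection's methods rather than against a specific mode. The sketch below assumes only the `upsert` and `hybrid_query` calls shown elsewhere in this README for both the embedded collection and the HTTP client; `index_and_search` is a hypothetical helper, not part of vxdb.

```python
def index_and_search(collection, embed, texts, question, top_k=5):
    """Upsert `texts` under generated ids, then run one hybrid query.

    `collection` is any object exposing upsert() and hybrid_query() with the
    keyword arguments used below; `embed` maps a string to a vector.
    """
    ids = [f"doc-{i}" for i in range(len(texts))]
    collection.upsert(ids=ids, vectors=[embed(t) for t in texts], documents=texts)
    return collection.hybrid_query(vector=embed(question), query=question, top_k=top_k)
```

In embedded mode the collection would come from `vxdb.Database(...).create_collection(...)`, and in server mode from `Client(...).create_collection(...)`, as in the README's own examples. The helper itself should not need to change.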

The full picture

                    ┌─────────────────────────────────────────────────┐
                    │               Your Python Code                  │
                    └─────────────┬───────────────────┬───────────────┘
                                  │                   │
                    ┌─────────────▼──────┐  ┌────────▼────────────┐
                    │  Embedded (PyO3)   │  │  Server (REST API)  │
                    │  In-process,       │  │  Axum, async,       │
                    │  no serialize,     │  │  multi-client       │
                    │  <1µs overhead     │  │                     │
                    └─────────────┬──────┘  └────────┬────────────┘
                                  │                   │
                    ┌─────────────▼───────────────────▼───────────────┐
                    │              Rust Core Engine                    │
                    │                                                  │
                    │  ┌──────────┐ ┌──────────┐ ┌─────────────────┐  │
                    │  │   HNSW   │ │   Flat   │ │  BM25 Keyword   │  │
                    │  │  Index   │ │  Index   │ │     Index       │  │
                    │  └──────────┘ └──────────┘ └─────────────────┘  │
                    │  ┌──────────────────┐ ┌──────────────────────┐  │
                    │  │ Distance Metrics  │ │  Metadata Filtering  │  │
                    │  │ cosine/L2/dot     │ │  10 operators, SQL   │  │
                    │  └──────────────────┘ └──────────────────────┘  │
                    │  ┌──────────────────────────────────────────┐   │
                    │  │   Hybrid Search (Reciprocal Rank Fusion)  │   │
                    │  └──────────────────────────────────────────┘   │
                    └─────────────────────┬───────────────────────────┘
                                          │
                    ┌─────────────────────▼───────────────────────────┐
                    │                  Storage                        │
                    │  mmap vectors │ SQLite metadata │ Write-Ahead Log│
                    └─────────────────────────────────────────────────┘
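To make the "Distance Metrics" and "Flat Index" boxes concrete, here is a minimal pure-Python sketch of the three metrics and an exact, brute-force search over them. It is an illustration only: the real engine is Rust, and the function names and the score conventions (higher is better for cosine and dot, lower is better for euclidean) are assumptions.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def euclidean_distance(a, b):
    """Straight-line (L2) distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def flat_search(query, vectors, top_k=5, metric="cosine"):
    """Exact search: score every stored vector and return the best top_k.

    `vectors` maps id -> vector. Returns (id, score) pairs, best first.
    """
    if metric == "cosine":
        scored = [(vid, cosine_similarity(query, v)) for vid, v in vectors.items()]
        best_is_highest = True
    elif metric == "dot":
        scored = [(vid, dot(query, v)) for vid, v in vectors.items()]
        best_is_highest = True
    elif metric == "euclidean":
        scored = [(vid, euclidean_distance(query, v)) for vid, v in vectors.items()]
        best_is_highest = False
    else:
        raise ValueError(f"unknown metric: {metric}")
    scored.sort(key=lambda pair: pair[1], reverse=best_is_highest)
    return scored[:top_k]
```

A flat index like this is exact but scans every vector on each query; an HNSW index gives up some accuracy to avoid the full scan.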

Quick Start

3 lines to your first search

import vxdb

# Persistent (data survives restarts)
db = vxdb.Database(path="./my_data")

# Or in-memory (ephemeral, great for prototyping)
# db = vxdb.Database()

collection = db.create_collection("docs", dimension=384, metric="cosine")

Insert vectors

collection.upsert(
    ids=["a", "b", "c"],
    vectors=[[0.1, 0.2, ...], [0.3, 0.4, ...], [0.5, 0.6, ...]],
    metadata=[{"type": "article"}, {"type": "blog"}, {"type": "article"}],
    documents=["intro to ML", "my favorite recipes", "deep learning guide"],
)

Search — four ways

# 1. Vector similarity
results = collection.query(vector=[0.1, 0.2, ...], top_k=5)

# 2. Filtered (metadata constraints)
results = collection.query(
    vector=[0.1, ...], top_k=5,
    filter={"type": {"$eq": "article"}}
)

# 3. Hybrid (vector + keyword — the sweet spot)
results = collection.hybrid_query(
    vector=[0.1, ...],
    query="machine learning",
    top_k=5,
    alpha=0.5,  # 0=keyword only, 1=vector only
)

# 4. Keyword only (BM25)
results = collection.keyword_search(query="machine learning", top_k=5)

Every result is a dict with the keys "id", "score", "metadata", and "document".

Installation

pip install vxdb

That's the whole thing. Works on macOS, Linux, Windows. Python 3.9+.

For the HTTP client (talking to a remote vxdb server):

pip install 'vxdb[server]'

Embedding Providers

vxdb stores pre-computed vectors, so you can bring any embedding model you want. Step-by-step notebooks cover most of the providers below:

Provider	Install	API Key?	Notebook
OpenAI	`pip install openai`	Yes	examples/openai_embeddings.ipynb
Sentence Transformers	`pip install sentence-transformers`	No (local)	examples/sentence_transformers.ipynb
LangChain (any provider)	`pip install langchain-openai`	Depends	examples/langchain_integration.ipynb
Cohere	`pip install cohere`	Yes	examples/cohere_embeddings.ipynb
Ollama (local LLMs)	`pip install ollama`	No (local)	—

Or use the pluggable interface:

from vxdb.embedding import EmbeddingFunction

class MyEmbedder(EmbeddingFunction):
    def embed(self, texts: list[str]) -> list[list[float]]:
        # your_model is a placeholder for any model object with an encode() method
        return your_model.encode(texts)

Server Mode

Same engine, accessed over HTTP. Deploy it as a standalone service.

# Start the server
vxdb-server --host 0.0.0.0 --port 8080

Python client:

from vxdb import Client

client = Client("http://localhost:8080")
coll = client.create_collection("docs", dimension=384)
coll.upsert(ids=["a"], vectors=[[0.1, ...]], documents=["hello world"])
results = coll.hybrid_query(vector=[0.1, ...], query="hello", top_k=5)

cURL:

# Create collection (dimension 2 to match the two-element example vectors below)
curl -X POST localhost:8080/collections \
  -H "Content-Type: application/json" \
  -d '{"name": "docs", "dimension": 2}'

# Upsert
curl -X POST localhost:8080/collections/docs/upsert \
  -H "Content-Type: application/json" \
  -d '{"ids": ["a"], "vectors": [[0.1, 0.2]], "documents": ["hello world"]}'

# Query
curl -X POST localhost:8080/collections/docs/query \
  -H "Content-Type: application/json" \
  -d '{"vector": [0.1, 0.2], "top_k": 5}'

Docker:

docker build -t vxdb .
docker run -p 8080:8080 vxdb    # ~145 MB Debian-based image

Hybrid Search

Many vector databases give you vector search or keyword search, or ask you to supply your own sparse encoder to combine them. vxdb gives you both, fused in a single call.

How it works:

1. You upsert with documents: raw text is tokenized into a built-in BM25 index alongside your vectors.
2. At query time, vector search and BM25 run in parallel, then Reciprocal Rank Fusion merges both ranked lists.
3. You control the blend: alpha=1.0 (pure vector) → alpha=0.5 (balanced) → alpha=0.0 (pure keyword).
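The fusion step can be sketched in a few lines. This is a hedged illustration, not vxdb's code: it uses the conventional RRF constant `k=60` and assumes `alpha` weights the two rank contributions linearly, which matches the documented endpoints (1.0 is pure vector, 0.0 is pure keyword) but may differ from the engine in detail. `reciprocal_rank_fusion` is a hypothetical name.

```python
def reciprocal_rank_fusion(vector_ranked, keyword_ranked, alpha=0.5, k=60):
    """Merge two ranked id lists into one list of (id, fused_score), best first.

    Each list contributes weight / (k + rank) per id, with ranks starting at 1.
    The vector list is weighted by alpha, the keyword list by (1 - alpha).
    """
    scores = {}
    for rank, doc_id in enumerate(vector_ranked, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + alpha / (k + rank)
    for rank, doc_id in enumerate(keyword_ranked, start=1):
        scores[doc_id] = scores.get(doc_id, 0.0) + (1.0 - alpha) / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


fused = reciprocal_rank_fusion(["a", "b", "c"], ["c", "a", "d"], alpha=0.5)
print([doc_id for doc_id, _ in fused])  # ['a', 'c', 'b', 'd']
```

Because RRF works on ranks rather than raw scores, cosine similarities and BM25 scores never need to be put on a common scale.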

When to use it: Specific product names. Error codes. Proper nouns. Anything where exact terms matter alongside semantic meaning. See examples/hybrid_search.ipynb for a deep dive with side-by-side comparisons.

results = collection.hybrid_query(
    vector=embed("lightweight laptop for students"),
    query="MacBook Air M4",
    top_k=5,
    alpha=0.5,
)

How vxdb compares

	vxdb	Zvec (Alibaba)	ChromaDB	Qdrant	Pinecone	Milvus	Weaviate	FAISS
Language	Rust	C++ (Proxima)	Rust (v1.0+)	Rust	Proprietary	Go/C++	Go	C++
Embedded mode	PyO3, true in-process	In-process	In-process	Python-only local mode	No	Milvus Lite	Subprocess (downloads Go binary)	SWIG bindings
Server mode	Yes	No	Yes	Yes	Cloud only	Yes	Yes	No
`pip install` just works	Yes	Yes	Yes	Yes (local mode)	N/A (SaaS)	Yes (Milvus Lite)	Yes (Linux/macOS)	Yes
Python dependencies	None (zero)	DashText SDK	Several	numpy, grpcio, etc.	N/A	grpcio, protobuf, etc.	grpcio, etc.	numpy
Wheel size	~5 MB	~30 MB	~20 MB	~50 MB	N/A	~50 MB+	~100 MB+ (downloads binary)	~20 MB
Startup time	<10 ms	<100 ms	<500 ms	~1-3 s (server)	N/A	~5-10 s (server)	~3-5 s (server)	<10 ms
Hybrid search	Built-in BM25 + RRF	BM25 + RRF + weighted	RRF (dense+sparse)	RRF, DBSF	Sparse+dense	Sparse vectors	BM25 + RRF	No
BM25 without external encoder	Yes (automatic)	Requires DashText SDK	No	Requires sparse encoder	No	Requires sparse encoder	Yes	No
Sparse vectors	No	Yes	Yes	Yes	Yes	Yes	No	No
Multi-vector queries	No	Yes	No	Yes	No	No	No	No
Metadata filtering	10 operators	Structured filters	Yes	Yes	Yes	Yes	Yes	No
Persistence	mmap + SQLite + WAL	Custom engine	SQLite	RocksDB	Cloud	RocksDB	LSM	Manual
Crash recovery	WAL	Yes	Yes (v1.0)	Yes	Yes	Yes	Yes	No
Quantization	No (planned)	int8, RabitQ	No	Scalar/PQ	Yes	Yes	PQ/BQ	PQ/SQ
Docker image	~145 MB	N/A (no server)	~200 MB+	~100 MB	No	~1 GB+	~300 MB+	No
Runs offline	Yes	Yes	Yes	Yes	No	Yes	Yes	Yes
License	Apache 2.0	Apache 2.0	Apache 2.0	Apache 2.0	Proprietary	Apache 2.0	BSD-3	MIT

API Reference

Python (Embedded)

# Database
db = vxdb.Database()                  # in-memory (ephemeral)
db = vxdb.Database(path="./my_data")  # persistent (data survives restarts)
db.create_collection(name, dimension, metric="cosine", index="flat")
db.get_collection(name)
db.list_collections()
db.delete_collection(name)

# Collection
collection.upsert(ids, vectors, metadata=None, documents=None)
collection.query(vector, top_k=10, filter=None)
collection.hybrid_query(vector, query, top_k=10, alpha=0.5)
collection.keyword_search(query, top_k=10)
collection.delete(ids)
collection.count()

REST API

Method	Endpoint	Description
`POST`	`/collections`	Create collection
`GET`	`/collections`	List collections
`DELETE`	`/collections/{name}`	Delete collection
`POST`	`/collections/{name}/upsert`	Upsert vectors (+ optional documents)
`POST`	`/collections/{name}/query`	Vector search (+ optional filter)
`POST`	`/collections/{name}/hybrid`	Hybrid vector + keyword search
`POST`	`/collections/{name}/keyword`	BM25 keyword search
`POST`	`/collections/{name}/delete`	Delete vectors by ID
`GET`	`/collections/{name}/count`	Count vectors
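The POST endpoints above can also be called with nothing but the Python standard library. The sketch below is an assumption-laden illustration: `build_request` and `send` are hypothetical helpers, and the JSON body for `/hybrid` is assumed to mirror the arguments of the Python `hybrid_query` call, which this README does not spell out.

```python
import json
import urllib.request


def build_request(base_url: str, path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a server endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request):
    """Send the request and decode the response, assuming a JSON body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


hybrid_request = build_request(
    "http://localhost:8080",
    "/collections/docs/hybrid",
    # Assumed body shape: same field names as the Python hybrid_query arguments.
    {"vector": [0.1, 0.2], "query": "hello", "top_k": 5, "alpha": 0.5},
)
# results = send(hybrid_request)  # uncomment with a vxdb-server running on port 8080
```

Keeping request construction separate from sending makes the payloads easy to inspect or log before anything goes over the wire.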

Parameters

Parameter	Values	Default
`metric`	`"cosine"`, `"euclidean"`, `"dot"`	`"cosine"`
`index`	`"flat"` (exact), `"hnsw"` (approximate)	`"flat"`
`filter`	`$eq` `$ne` `$gt` `$gte` `$lt` `$lte` `$in` `$nin` `$and` `$or`	—
`alpha`	`0.0` (keyword) to `1.0` (vector)	`0.5`
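The ten filter operators read like MongoDB-style query operators. As a rough sketch of how such a filter could be evaluated against one metadata dict, the code below assumes that `$and` and `$or` take a list of sub-filters and that a missing field only satisfies `$ne` and `$nin`. Neither detail is confirmed by this README, and `matches` is a hypothetical helper rather than vxdb's API.

```python
import operator

_MISSING = object()  # sentinel for "field not present in the metadata"

_COMPARISONS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}


def _check(value, op, expected):
    """Apply one comparison operator, treating a missing field as non-matching."""
    if value is _MISSING:
        return op in ("$ne", "$nin")
    return _COMPARISONS[op](value, expected)


def matches(metadata: dict, flt: dict) -> bool:
    """Return True if `metadata` satisfies every clause in `flt`."""
    for key, condition in flt.items():
        if key == "$and":
            ok = all(matches(metadata, sub) for sub in condition)
        elif key == "$or":
            ok = any(matches(metadata, sub) for sub in condition)
        else:
            value = metadata.get(key, _MISSING)
            ok = all(_check(value, op, expected) for op, expected in condition.items())
        if not ok:
            return False
    return True
```

For example, `matches({"type": "article"}, {"type": {"$eq": "article"}})` evaluates the same filter used in the quick start.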

Examples

Interactive Jupyter notebooks with step-by-step walkthroughs:

Notebook	What you'll build
quickstart.ipynb	Every feature in 5 min (no API keys)
openai_embeddings.ipynb	Semantic search with OpenAI embeddings
sentence_transformers.ipynb	Free, local embeddings (no API key)
langchain_integration.ipynb	LangChain + RAG pipeline
cohere_embeddings.ipynb	Multilingual search with Cohere
hybrid_search.ipynb	Deep dive: vector vs keyword vs hybrid

Development

git clone https://github.com/getmykhan/vxdb.git && cd vxdb

# Rust
cargo build --all
cargo test --all        # 120+ tests

# Python
uv venv .venv && source .venv/bin/activate
uv pip install maturin pytest httpx
maturin develop
PYTHONPATH=python pytest tests/ -v

The codebase is a Cargo workspace:

vxdb/
├── crates/
│   ├── vxdb-core/       # Engine: indexes, distance, storage, hybrid search
│   ├── vxdb-python/     # PyO3 bindings
│   └── vxdb-server/     # Axum REST API server
├── python/vxdb/         # Python package (client SDK, embedding interface)
├── examples/             # Jupyter notebooks
└── tests/                # Python integration tests

Roadmap

Persistent collections (mmap + SQLite + WAL): done
SIMD-accelerated distance computation
Quantization (int8/binary) for reduced memory
GPU acceleration (CUDA/Metal)
HNSW graph serialization (fast restart for large indexes)
Streaming upsert for large datasets
Sparse vector support
gRPC API
Official LangChain VectorStore integration
Kubernetes Helm chart
Benchmarks suite vs Qdrant, ChromaDB, Zvec, FAISS

License

Apache 2.0

This version

0.1.0

Jun 24, 2026

Download files

Source Distribution

vxdb-0.1.0.tar.gz (55.4 kB)

Uploaded Jun 24, 2026 Source

Built Distributions

vxdb-0.1.0-cp39-abi3-win_amd64.whl (1.3 MB)

Uploaded Jun 24, 2026, CPython 3.9+, Windows x86-64

vxdb-0.1.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB)

Uploaded Jun 24, 2026, CPython 3.9+, manylinux: glibc 2.17+ x86-64

vxdb-0.1.0-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.4 MB)

Uploaded Jun 24, 2026, CPython 3.9+, manylinux: glibc 2.17+ ARM64

vxdb-0.1.0-cp39-abi3-macosx_11_0_arm64.whl (1.4 MB)

Uploaded Jun 24, 2026, CPython 3.9+, macOS 11.0+ ARM64

vxdb-0.1.0-cp39-abi3-macosx_10_12_x86_64.whl (1.5 MB)

Uploaded Jun 24, 2026, CPython 3.9+, macOS 10.12+ x86-64

File details

Details for the file vxdb-0.1.0.tar.gz.

File metadata

Download URL: vxdb-0.1.0.tar.gz
Upload date: Jun 24, 2026
Size: 55.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.14.1

File hashes

Hashes for vxdb-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`0f965311d8aaab2c05d8a6563d323698ba7db88f61cc573e3baaa2fe1df31ee6`
MD5	`eefeb8f801746a8c8071b34ec5e8c0ce`
BLAKE2b-256	`27f9bac143b105c1fec0b8afa4778c8c731e7d22ee152d135382b0fb6eee8ee1`


File details

Details for the file vxdb-0.1.0-cp39-abi3-win_amd64.whl.

File metadata

Download URL: vxdb-0.1.0-cp39-abi3-win_amd64.whl
Upload date: Jun 24, 2026
Size: 1.3 MB
Tags: CPython 3.9+, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.14.1

File hashes

Hashes for vxdb-0.1.0-cp39-abi3-win_amd64.whl
Algorithm	Hash digest
SHA256	`2eb28187fb89956c6fa0ce42add504fcc93e474140d8c0141ff383a1f4522c3c`
MD5	`99b07b3dbef8f9f172b59416b5b24a12`
BLAKE2b-256	`c3ad80a577ddbd0dc4094516d840c38dbb12d8fb398175c5f01ddf5d068aaacd`


File details

Details for the file vxdb-0.1.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: vxdb-0.1.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: Jun 24, 2026
Size: 1.6 MB
Tags: CPython 3.9+, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.14.1

File hashes

Hashes for vxdb-0.1.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`06e0a350bf2bb9e79bac61d5c025f7c2b5afc5233458338c10ace597fae190df`
MD5	`4fc7166aaa023efad0700b0f6771e4a7`
BLAKE2b-256	`247a01e16cdaaf88ff63c19a43982d860878d0f4f3a6936cb41a316915dd14b8`


File details

Details for the file vxdb-0.1.0-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

Download URL: vxdb-0.1.0-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Upload date: Jun 24, 2026
Size: 1.4 MB
Tags: CPython 3.9+, manylinux: glibc 2.17+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.14.1

File hashes

Hashes for vxdb-0.1.0-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm	Hash digest
SHA256	`4dd7f90e9e9d011c57d87c6c96c7a183b6a797fd8e29ac4aaa5774b826130a3c`
MD5	`50e11b50dd68622cf4fb9dbe3cbccb7c`
BLAKE2b-256	`0b9c1da5be18c686a3e13a6a5b3565f958612898f7cba942f92fcf69437f6104`


File details

Details for the file vxdb-0.1.0-cp39-abi3-macosx_11_0_arm64.whl.

File metadata

Download URL: vxdb-0.1.0-cp39-abi3-macosx_11_0_arm64.whl
Upload date: Jun 24, 2026
Size: 1.4 MB
Tags: CPython 3.9+, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.14.1

File hashes

Hashes for vxdb-0.1.0-cp39-abi3-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`4d8b637fe9609ae911de13d667ca2ea1435f9664a311794b0a116b07e7fdcc80`
MD5	`f050c871754bd7f9db3de23d6962816e`
BLAKE2b-256	`2ec57486f6e4d44cec51716247df5237486ffb962a408de4cbdfcefbba2ed279`


File details

Details for the file vxdb-0.1.0-cp39-abi3-macosx_10_12_x86_64.whl.

File metadata

Download URL: vxdb-0.1.0-cp39-abi3-macosx_10_12_x86_64.whl
Upload date: Jun 24, 2026
Size: 1.5 MB
Tags: CPython 3.9+, macOS 10.12+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.14.1

File hashes

Hashes for vxdb-0.1.0-cp39-abi3-macosx_10_12_x86_64.whl
Algorithm	Hash digest
SHA256	`1eba73d759d1ff8ca46803eda6c02a59393d929bd367267d7d78003671805ba5`
MD5	`e144c3dbe2d934a97d073b599c6702f4`
BLAKE2b-256	`72f04deae1cd11d7fa7de12d779437df0e4af378839e89165bc2329aa446e6ee`

