Dead-simple local vector database powered by usearch HNSW.
SimpleVecDB
The dead-simple, local-first vector database.
SimpleVecDB brings Chroma-like simplicity to a single SQLite file. Built on usearch HNSW indexing, it offers high-performance vector search, quantization, and zero infrastructure headaches. Perfect for local RAG, offline agents, and indie hackers who need production-grade vector search without the operational overhead.
Why SimpleVecDB?
- Zero Infrastructure — Just a .db file. No Docker, no Redis, no cloud bills.
- Blazing Fast — 10-100x faster search via usearch HNSW. Adaptive: brute-force for <10k vectors (perfect recall), HNSW for larger collections.
- Truly Portable — Runs anywhere SQLite runs: Linux, macOS, Windows, even WASM.
- Async Ready — Full async/await support with optional executor injection for thread-safe ONNX/usearch sharing.
- Batteries Included — Optional FastAPI embeddings server + LangChain/LlamaIndex integrations via the [integrations] extra.
- Production Ready — Hybrid search (BM25 + vector), metadata filtering, multi-collection support, and automatic hardware acceleration.
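To make "brute-force (perfect recall)" concrete, here is a minimal pure-Python sketch of exact cosine search. It is an illustration only, not SimpleVecDB's implementation; the helper names `cosine` and `exact_top_k` are hypothetical. Because every stored vector is scored, the true nearest neighbours cannot be missed, which is what an approximate index such as HNSW trades away for speed on large collections.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def exact_top_k(query: list[float], vectors: list[list[float]], k: int) -> list[tuple[int, float]]:
    """Score every vector against the query and return the k best (index, score) pairs."""
    scored = [(index, cosine(query, vec)) for index, vec in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data: three 2-dimensional vectors
vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
top = exact_top_k([1.0, 0.1], vectors, k=2)
print(top)
```

The cost grows linearly with collection size, which is why a full scan is only attractive for small collections.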
When to Choose SimpleVecDB
| Use Case | SimpleVecDB | Cloud Vector DB |
|---|---|---|
| Local RAG applications | ✅ Perfect fit | ❌ Overkill + latency |
| Offline-first agents | ✅ No internet needed | ❌ Requires connectivity |
| Prototyping & MVPs | ✅ Zero config | ⚠️ Setup overhead |
| Multi-tenant SaaS at scale | ⚠️ Consider sharding | ✅ Built for this |
| Budget-conscious projects | ✅ $0/month | ❌ $50-500+/month |
Prerequisites
System Requirements:
- Python 3.10+
- SQLite 3.35+ with FTS5 support (included in Python 3.8+ standard library)
- 50MB+ disk space for core library, 500MB+ with [server] extras
Optional for GPU Acceleration:
- CUDA 11.8+ for NVIDIA GPUs
- Metal Performance Shaders (MPS) for Apple Silicon
Note: If you use a custom-compiled SQLite, build it with -DSQLITE_ENABLE_FTS5 so that full-text search is available.
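If you are unsure whether your interpreter meets these requirements, a quick standard-library probe can tell you. This sketch does not touch SimpleVecDB at all; `sqlite_supports_fts5` is a hypothetical helper that simply tries to create an FTS5 virtual table in an in-memory database.

```python
import sqlite3


def sqlite_supports_fts5() -> bool:
    """Return True if this Python's SQLite build can create an FTS5 table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE fts_probe USING fts5(body)")
        return True
    except sqlite3.OperationalError:
        # Raised as "no such module: fts5" when FTS5 was not compiled in
        return False
    finally:
        conn.close()


version_ok = sqlite3.sqlite_version_info >= (3, 35, 0)
fts5_ok = sqlite_supports_fts5()
print(f"SQLite {sqlite3.sqlite_version}: version_ok={version_ok}, fts5={fts5_ok}")
```

Note that `sqlite3.sqlite_version` reports the SQLite library linked into Python, which can differ from the `sqlite3` command-line tool on the same machine.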
Installation
# Standard installation (includes clustering, encryption)
pip install simplevecdb
# With LangChain & LlamaIndex integrations
pip install "simplevecdb[integrations]"
# With local embeddings server (adds 500MB+ models)
pip install "simplevecdb[server]"
What's included by default:
- Vector search with HNSW indexing
- Clustering (K-means, MiniBatch K-means, HDBSCAN)
- Encryption (SQLCipher AES-256)
- Async support
Verify Installation:
python -c "from simplevecdb import VectorDB; print('SimpleVecDB installed successfully!')"
Quickstart
SimpleVecDB is just a storage and search layer — it doesn't ship an LLM and won't generate embeddings for you. Bring whichever embedding source you already use; three common ones below.
Option 1: OpenAI embeddings
from simplevecdb import VectorDB
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
db = VectorDB("notes.db")
notes = db.collection("personal")


def embed(text: str) -> list[float]:
    """Embed a single string with OpenAI's text-embedding-3-small model."""
    return (
        client.embeddings
        .create(model="text-embedding-3-small", input=text)
        .data[0].embedding
    )


# (note text, tag) pairs to index
entries = [
    ("Cherry MX silent reds bottom out around 45g — quieter than browns", "keyboards"),
    ("Sourdough hydration sweet spot is ~75% with this flour", "baking"),
    ("EXPLAIN ANALYZE showed seq scan; ANALYZE on the table fixed it", "work"),
    ("Passport renewal took 3 weeks, not the advertised 6–8", "admin"),
]

# Store each text with its embedding and its metadata
notes.add_texts(
    texts=[t for t, _ in entries],
    embeddings=[embed(t) for t, _ in entries],
    metadatas=[{"tag": tag} for _, tag in entries],
)

# Vector search: the two notes closest to the query
hits = notes.similarity_search(embed("how loud are silent reds"), k=2)
for doc, score in hits:
    print(f"{score:.3f} {doc.page_content}")

# Same kind of search, restricted to notes tagged "work"
work = notes.similarity_search(
    embed("query plan slow"),
    k=5,
    filter={"tag": "work"},
)
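The example above makes one API request per note. OpenAI's embeddings endpoint also accepts a list of strings, so a batch can be embedded in a single call. The helper below is a sketch under that assumption; `embed_batch` is a hypothetical name, and it takes the client as an argument so it can be exercised without network access.

```python
from typing import Any, Sequence


def embed_batch(
    client: Any,
    texts: Sequence[str],
    model: str = "text-embedding-3-small",
) -> list[list[float]]:
    """Embed many texts in one request and return vectors in input order."""
    response = client.embeddings.create(model=model, input=list(texts))
    # Each returned item carries the index of the input it belongs to;
    # sorting on it keeps vectors aligned with `texts`.
    ordered = sorted(response.data, key=lambda item: item.index)
    return [item.embedding for item in ordered]
```

With a real client this could replace the per-note list comprehension, for example `embeddings=embed_batch(client, [t for t, _ in entries])`.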
Option 2: Fully local (no network, no API key)
pip install "simplevecdb[server]"
from simplevecdb import VectorDB
from simplevecdb.embeddings.models import embed_texts

db = VectorDB("notes.db")
notes = db.collection("personal")

texts = [
    "Cherry MX silent reds bottom out around 45g",
    "Sourdough hydration sweet spot is ~75% with this flour",
    "EXPLAIN ANALYZE showed seq scan; ANALYZE on the table fixed it",
]
notes.add_texts(texts=texts, embeddings=embed_texts(texts))

# Vector search with a locally computed query embedding
vec = notes.similarity_search(embed_texts(["quieter switches"])[0], k=2)
# Hybrid search (BM25 + vector) takes the raw query text
mixed = notes.hybrid_search("postgres slow query", k=3)
If you'd rather hit an HTTP endpoint than import the embedding models directly, the bundled server speaks the same shape as OpenAI's embeddings API:
simplevecdb-server --port 8000 # default model, auto warm-up
simplevecdb-server --host 0.0.0.0 --port 9000
simplevecdb-server --no-warmup # skip the model preload
simplevecdb-server --help
Server tuning (model registry, rate limits, API keys, CORS, CUDA) lives in the Setup Guide.
Option 3: LangChain or LlamaIndex
Already wired into one of the big RAG frameworks? Drop SimpleVecDB in as the vector store:
pip install "simplevecdb[integrations]"
from simplevecdb.integrations.langchain import SimpleVecDBVectorStore
from langchain_openai import OpenAIEmbeddings

store = SimpleVecDBVectorStore(
    db_path="notes.db",
    embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
)
store.add_texts([
    "Cherry MX silent reds bottom out around 45g",
    "EXPLAIN ANALYZE showed seq scan; ANALYZE on the table fixed it",
])

store.similarity_search("quieter switches", k=1)
store.hybrid_search("postgres performance", k=3)
LlamaIndex is the same shape:
from simplevecdb.integrations.llamaindex import SimpleVecDBLlamaStore
from llama_index.embeddings.openai import OpenAIEmbedding

store = SimpleVecDBLlamaStore(
    db_path="notes.db",
    embedding=OpenAIEmbedding(model="text-embedding-3-small"),
)
End-to-end notebooks (including a fully local Ollama RAG) live in the examples gallery.
Feature Highlights
A few of the things SimpleVecDB does well — see docs/Features.md for the comprehensive list.
- Vector + keyword + hybrid search — cosine / L2 similarity, BM25 via SQLite FTS5, and Reciprocal Rank Fusion in one collection.
- Adaptive HNSW — brute-force for <10k vectors (perfect recall), usearch HNSW above that. Override per query with exact=True/False.
- Quantization — FLOAT32, FLOAT16, INT8, BIT for 1×–32× compression.
- Multi-collection + cross-collection search — isolated namespaces in one .db file, with merged ranked search across them.
- Mongo-style filters — $eq $ne $gt $gte $lt $lte $in $nin $exists $between on metadata, edges, and events.
- Memory primitives (v2.6.1) — pending-vector buffer with atomic flush, weighted directed edges, append-only event feed, TTL with delete/callback sweep, and a threshold-driven rebuild scheduler.
- Atomic counters & transactions (v2.6.1) — increment_metadata for JSON deltas in one statement; SAVEPOINT-backed db.transaction() / collection.tx() rolling all catalog writes back on error.
- Async, encryption, clustering, hierarchies — full async surface (with executor injection), SQLCipher AES-256, K-means / MiniBatch K-means / HDBSCAN, parent/child relationships.
- Framework integrations — drop-in LangChain and LlamaIndex adapters via the [integrations] extra; optional FastAPI embeddings server via [server].
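For readers unfamiliar with Reciprocal Rank Fusion, the idea behind merging a vector ranking with a keyword ranking can be sketched in a few lines. This is the textbook formula (each list contributes 1 / (k + rank) per document), not necessarily how SimpleVecDB weights or tunes it; the function name and the conventional constant k=60 are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document ids into one scored ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Higher-ranked documents contribute more; k damps the gap between ranks
            scores[doc_id] += 1.0 / (k + rank)
    # Best score first; ties broken by id so the output is deterministic
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


vector_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical vector-search order
keyword_hits = ["doc_c", "doc_a", "doc_d"]  # hypothetical BM25 order
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print([doc_id for doc_id, _ in fused])
```

Because only ranks are used, the two searches do not need comparable score scales, which is the main appeal when mixing BM25 scores with cosine similarities.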
For full method-level coverage, see the Features doc or the API reference.
Performance Benchmarks
10,000 vectors, 384 dimensions, k=10 search — Full benchmarks →
| Quantization | Storage | Query Time | Compression |
|---|---|---|---|
| FLOAT32 | 36.0 MB | 0.20 ms | 1x |
| FLOAT16 | 28.7 MB | 0.20 ms | 2x |
| INT8 | 25.0 MB | 0.16 ms | 4x |
| BIT | 21.8 MB | 0.08 ms | 32x |
Key highlights:
- 3-34x faster than brute-force for collections >10k vectors
- Adaptive search: perfect recall for small collections, HNSW for large
- FLOAT16 recommended: best balance of speed, memory, and precision
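One way to read the table: the Compression column matches the per-dimension encoding width (32, 16, 8, and 1 bits), while the Storage column shrinks far less, presumably because the file also holds the index, text, and metadata. That interpretation is an assumption, not something the benchmark states. The arithmetic for the raw vector payload alone is easy to check:

```python
# Bits used per dimension by each quantization level (from the level names)
BITS_PER_DIMENSION = {"FLOAT32": 32, "FLOAT16": 16, "INT8": 8, "BIT": 1}


def raw_vector_bytes(n_vectors: int, dim: int, quantization: str) -> int:
    """Bytes needed for the vector values alone, with no index or SQLite overhead."""
    total_bits = n_vectors * dim * BITS_PER_DIMENSION[quantization]
    return (total_bits + 7) // 8  # round up to whole bytes


baseline = raw_vector_bytes(10_000, 384, "FLOAT32")
for level in BITS_PER_DIMENSION:
    size = raw_vector_bytes(10_000, 384, level)
    print(f"{level:8s} {size / 1e6:6.2f} MB  {baseline / size:.0f}x")
```

For the benchmark's 10,000 vectors of 384 dimensions, the FLOAT32 payload works out to 15.36 MB and the BIT payload to 0.48 MB, a 32x ratio.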
Documentation
- Features — Comprehensive list of every capability, grouped by area
- Setup Guide — Environment variables, server configuration, authentication
- API Reference — Complete class/method documentation with type signatures
- Benchmarks — Quantization strategies, batch sizes, hardware optimization
- Integration Examples — RAG notebooks, Ollama workflows, production patterns
- Contributing Guide — Development setup, testing, PR guidelines
Troubleshooting
Import Error: sqlite3.OperationalError: no such module: fts5
# Your Python's SQLite was compiled without FTS5
# Solution: Install Python from python.org (includes FTS5) or compile SQLite with:
# -DSQLITE_ENABLE_FTS5
Dimension Mismatch Error
# Ensure all vectors in a collection have identical dimensions
collection = db.collection("docs", dim=384) # Explicit dimension
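A dimension mismatch is often easier to debug before the vectors reach the database. The check below is a small standalone sketch, with hypothetical helper names, that reports which embeddings in a batch have the wrong length so the offending inputs can be traced back to their source (for example, two different embedding models mixed in one batch).

```python
def find_dimension_mismatches(embeddings: list[list[float]], expected_dim: int) -> list[tuple[int, int]]:
    """Return (index, actual_length) for every vector whose length differs."""
    return [
        (index, len(vector))
        for index, vector in enumerate(embeddings)
        if len(vector) != expected_dim
    ]


def require_dimension(embeddings: list[list[float]], expected_dim: int) -> None:
    """Raise ValueError if any vector does not have expected_dim entries."""
    mismatches = find_dimension_mismatches(embeddings, expected_dim)
    if mismatches:
        raise ValueError(
            f"expected {expected_dim}-dimensional vectors; "
            f"mismatched (index, length) pairs: {mismatches}"
        )


# Two 384-dimensional vectors and one stray 256-dimensional vector
batch = [[0.1] * 384, [0.2] * 384, [0.3] * 256]
mismatches = find_dimension_mismatches(batch, 384)
print(mismatches)
```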
CUDA Not Detected (GPU Available)
# Verify CUDA installation
python -c "import torch; print(torch.cuda.is_available())"
# Reinstall PyTorch with CUDA support
pip install torch --index-url https://download.pytorch.org/whl/cu118
Slow Queries on Large Datasets
- Enable quantization: collection = db.collection("docs", quantization=Quantization.INT8)
- For >10k vectors, HNSW is automatic; tune with rebuild_index(connectivity=32)
- Use exact=False to force HNSW even on smaller collections
- Use metadata filtering to reduce search space
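The last tip relies on metadata filters. As a rough illustration of what Mongo-style operators mean, here is a pure-Python matcher over plain dicts. It is not SimpleVecDB's implementation, and several semantics are assumptions made for this sketch: `$between` is treated as inclusive, and `$ne` / `$nin` match records where the field is missing.

```python
import operator

_ORDERING = {"$gt": operator.gt, "$gte": operator.ge, "$lt": operator.lt, "$lte": operator.le}


def matches(metadata: dict, flt: dict) -> bool:
    """Return True if the metadata dict satisfies every condition in flt."""
    for field, condition in flt.items():
        present = field in metadata
        value = metadata.get(field)
        if not isinstance(condition, dict):
            condition = {"$eq": condition}  # bare value is shorthand for equality
        for op, operand in condition.items():
            if op == "$exists":
                ok = present == bool(operand)
            elif op == "$eq":
                ok = present and value == operand
            elif op == "$ne":
                ok = (not present) or value != operand
            elif op == "$in":
                ok = present and value in operand
            elif op == "$nin":
                ok = (not present) or value not in operand
            elif op == "$between":
                low, high = operand
                ok = present and low <= value <= high  # inclusive (assumed)
            elif op in _ORDERING:
                ok = present and _ORDERING[op](value, operand)
            else:
                raise ValueError(f"unsupported operator: {op}")
            if not ok:
                return False
    return True


rows = [{"tag": "work", "year": 2024}, {"tag": "baking", "year": 2021}, {"year": 2023}]
recent = [row for row in rows if matches(row, {"year": {"$gte": 2023}})]
print(recent)
```

Narrowing candidates this way means fewer vectors have to be scored, which is where the speed-up in the tip comes from.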
Roadmap
What's on the near-term radar:
- Incremental clustering (online learning)
- Cluster visualization exports
For shipped capabilities, see docs/Features.md and the release-by-release Changelog. Vote on these or propose new ideas in GitHub Discussions.
Contributing
Contributions are welcome! Whether you're fixing bugs, improving documentation, or proposing new features:
- Read CONTRIBUTING.md for development setup
- Check existing Issues and Discussions
- Open a PR with clear description and tests
Community & Support
Get Help:
- GitHub Discussions — Q&A and feature requests
- GitHub Issues — Bug reports
Stay Updated:
- GitHub Releases — Changelog and updates
- Examples Gallery — Community-contributed notebooks
Other Ways to Support
- ☕ Buy me a coffee - One-time donation
- ⭐ Star the repo - Helps with visibility
- 🐛 Report bugs - Improve the project for everyone
- 📝 Contribute - See CONTRIBUTING.md
License
MIT License — Free for personal and commercial use.
File details
Details for the file simplevecdb-2.6.1.tar.gz.
File metadata
- Download URL: simplevecdb-2.6.1.tar.gz
- Upload date:
- Size: 628.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.12 {"installer":{"name":"uv","version":"0.11.12","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a0dbba5454dd417e31649955ceea32859371dea33483aed83949ecb4961103e2 |
| MD5 | 3c66500a4d9b23bc3c9edc0d886073a3 |
| BLAKE2b-256 | 400f3b3964c867f12cdec37df99c9d3634c7cf611e87c4ff4813e31f816800a7 |
File details
Details for the file simplevecdb-2.6.1-py3-none-any.whl.
File metadata
- Download URL: simplevecdb-2.6.1-py3-none-any.whl
- Upload date:
- Size: 114.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.12 {"installer":{"name":"uv","version":"0.11.12","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 95bcb5eb6878971837f68046c1dfd854a76f81ecc4d3f5f62358d95bfb1cd832 |
| MD5 | 03619742b60d0be3aecb528621eb0f02 |
| BLAKE2b-256 | 71fef7b563eab5b4187cf56435cb65847c644391b06d35a8b95edf3fd5bd347d |