A high-performance Python vector database for semantic search and RAG applications
FastPyDB
A high-performance Python vector database with a simple, ChromaDB-like API. Features HNSW indexing, multiple embedding providers, quantization, knowledge graphs, and more.
Installation
# Install with pip (from source)
pip install -e .
# With local embeddings (recommended - no API key needed)
pip install -e ".[local]"
# With OpenAI embeddings
pip install -e ".[openai]"
# With all optional dependencies
pip install -e ".[all]"
To install the core dependencies directly, plus sentence-transformers for the default local embeddings:
pip install numpy hnswlib sentence-transformers
Quick Start
FastPyDB provides a simple, intuitive API similar to ChromaDB:
import fastpydb
# Create a client
client = fastpydb.Client()
# Create a collection (uses local embeddings by default)
collection = client.create_collection("my_documents")
# Add documents - embeddings are generated automatically
collection.add(
    documents=[
        "Machine learning is a subset of artificial intelligence",
        "Deep learning uses neural networks with many layers",
        "Natural language processing helps computers understand text"
    ],
    ids=["ml", "dl", "nlp"],
    metadatas=[
        {"category": "AI", "level": "beginner"},
        {"category": "AI", "level": "advanced"},
        {"category": "NLP", "level": "intermediate"}
    ]
)
# Search with natural language
results = collection.query(
    query_texts="What is AI?",
    n_results=3
)
# Print each document with its distance to the query
for doc, distance in zip(results.documents[0], results.distances[0]):
    print(f"Distance: {distance:.4f} - {doc}")
Core API
Client Operations
import fastpydb
# Create a client (data is stored in ./vectordb unless a path is given)
client = fastpydb.Client(path="./my_data")
# Create a new collection
collection = client.create_collection("documents")
# Get existing collection
collection = client.get_collection("documents")
# Get or create (safe for repeated calls)
collection = client.get_or_create_collection("documents")
# List all collections
print(client.list_collections())
# Delete a collection
client.delete_collection("documents")
# Save all data to disk
client.persist()
Collection Operations
# Add documents
collection.add(
    documents=["Hello world", "Goodbye world"],
    ids=["doc1", "doc2"],
    metadatas=[{"source": "greeting"}, {"source": "farewell"}]
)
# Add with pre-computed embeddings
collection.add(
    embeddings=[[0.1, 0.2, ...], [0.3, 0.4, ...]],
    ids=["doc1", "doc2"]
)
# Upsert (add or update)
collection.upsert(
    documents=["Updated document"],
    ids=["doc1"]
)
# Query/Search
results = collection.query(
    query_texts="search query",  # or query_embeddings=[...]
    n_results=10,
    where={"category": "tech"}  # optional filter
)
# Access results (one inner list per query)
print(results.ids)        # [["id1", "id2", ...]]
print(results.documents)  # [["doc1", "doc2", ...]]
print(results.distances)  # [[0.1, 0.2, ...]]
print(results.metadatas)  # [[{...}, {...}, ...]]
# Get by ID
result = collection.get(ids=["doc1", "doc2"])
# Get by metadata filter
result = collection.get(where={"category": "tech"}, limit=10)
# Update existing documents
collection.update(
    ids=["doc1"],
    documents=["New content"],
    metadatas=[{"version": 2}]
)
# Delete documents
collection.delete(ids=["doc1", "doc2"])
collection.delete(where={"category": "old"})
# Get count
print(f"Documents: {collection.count}")
# Peek at sample
sample = collection.peek(limit=5)
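The result fields shown above are nested lists, with one inner list per query. As a minimal sketch of working with that shape, the helper below flattens one query's hits into per-hit records. `QueryResult` and `flatten_hits` are hypothetical names used only for illustration, and the values are made up; the library's actual result class may differ.

```python
from dataclasses import dataclass


@dataclass
class QueryResult:
    """Hypothetical stand-in mirroring the nested result fields shown above."""
    ids: list
    documents: list
    distances: list
    metadatas: list


def flatten_hits(results, query_index=0):
    """Return one dict per hit for a single query, in the order returned."""
    return [
        {"id": i, "document": d, "distance": s, "metadata": m}
        for i, d, s, m in zip(
            results.ids[query_index],
            results.documents[query_index],
            results.distances[query_index],
            results.metadatas[query_index],
        )
    ]


# Made-up values, for illustration only.
example = QueryResult(
    ids=[["ml", "dl"]],
    documents=[["Machine learning ...", "Deep learning ..."]],
    distances=[[0.12, 0.34]],
    metadatas=[[{"level": "beginner"}, {"level": "advanced"}]],
)
hits = flatten_hits(example)
print(hits[0]["id"], hits[0]["distance"])
```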
Using Different Embedding Models
# Local embeddings (default) - no API key needed
collection = client.create_collection(
    name="docs",
    embedding_model="all-MiniLM-L6-v2"  # Fast, 384 dimensions
)
# Higher quality local model
collection = client.create_collection(
    name="docs",
    embedding_model="all-mpnet-base-v2"  # Better quality, 768 dimensions
)
# OpenAI embeddings (requires OPENAI_API_KEY)
collection = client.create_collection(
    name="docs",
    embedding_model="text-embedding-3-small",
    embedding_provider="openai"
)
# Cohere embeddings (requires COHERE_API_KEY)
collection = client.create_collection(
    name="docs",
    embedding_model="embed-english-v3.0",
    embedding_provider="cohere"
)
Filtering
# Simple equality filter
results = collection.query(
    query_texts="search",
    where={"category": "tech"}
)
# Multiple conditions (AND)
results = collection.query(
    query_texts="search",
    where={"category": "tech", "year": 2024}
)
# Using Filter class for complex queries
from fastpydb import Filter
results = collection.query(
    query_texts="search",
    filter=Filter.and_([
        Filter.eq("category", "tech"),
        Filter.gte("score", 0.8),
        Filter.in_("status", ["published", "draft"])
    ])
)
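To make the AND semantics of a multi-key `where` dict concrete, here is a minimal pure-Python sketch. It assumes that each key must be present in the metadata with an equal value, as the examples above suggest. `matches_where` is a hypothetical helper, not part of FastPyDB.

```python
def matches_where(metadata, where):
    """True when every key in `where` is present in `metadata` with an equal value."""
    return all(key in metadata and metadata[key] == value for key, value in where.items())


# Made-up records, for illustration only.
records = [
    {"id": "a", "metadata": {"category": "tech", "year": 2024}},
    {"id": "b", "metadata": {"category": "tech", "year": 2023}},
    {"id": "c", "metadata": {"category": "news", "year": 2024}},
]

where = {"category": "tech", "year": 2024}
matched_ids = [r["id"] for r in records if matches_where(r["metadata"], where)]
print(matched_ids)
```

Only record "a" satisfies both conditions; an empty `where` dict matches every record under this reading.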
Examples
See the examples/ directory for complete examples:
# Run quickstart examples
python examples/quickstart.py
# RAG application demo
python examples/rag_demo.py
# News intelligence demo
python examples/news_intelligence_demo.py
Advanced Usage
Advanced features such as quantization, parallel search, and knowledge graphs are covered in the Low-Level API section below.
Features
- Simple API — ChromaDB-like interface for easy adoption
- Multiple Embeddings — Local (Sentence Transformers), OpenAI, Cohere
- HNSW Indexing — Sub-millisecond approximate nearest neighbor search
- Metadata Filtering — Filter by any metadata field
- Quantization — 4-32x memory compression with scalar, binary, and product quantizers
- Parallel Search — Multi-core BLAS/GEMM acceleration (67x speedup)
- Knowledge Graph — Nodes, edges, traversal, and Cypher-like queries
- Hybrid Search — Combined vector similarity + graph relationships
- REST API — FastAPI server with WebSocket real-time updates
- Persistence — Save/load to disk with automatic recovery
Architecture
┌─────────────────────────────────────────────────────────────────┐
│                         FastPyDB System                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────────┐    ┌──────────────────┐                   │
│  │  fastpydb/       │    │  quantization    │                   │
│  │  client.py       │    │  .py             │                   │
│  │  ──────────────  │    │  ──────────────  │                   │
│  │  • Client        │    │  • Scalar (4x)   │                   │
│  │  • Collection    │    │  • Binary (32x)  │                   │
│  │  • Simple API    │    │  • Product (8x)  │                   │
│  │  • Auto-embed    │    │                  │                   │
│  └────────┬─────────┘    └────────┬─────────┘                   │
│           │                       │                             │
│           ▼                       ▼                             │
│  ┌──────────────────────────────────────────┐                   │
│  │          vectordb_optimized.py           │                   │
│  │  ──────────────────────────────────────  │                   │
│  │  • VectorDB / Collection (Core)          │                   │
│  │  • HNSW Index                            │                   │
│  │  • Filter Engine                         │                   │
│  └────────────────────┬─────────────────────┘                   │
│                       │                                         │
│           ┌───────────┴───────────┐                             │
│           ▼                       ▼                             │
│  ┌──────────────────┐    ┌──────────────────┐                   │
│  │  graph.py        │    │  parallel_search │                   │
│  │  ──────────────  │    │  .py             │                   │
│  │  • GraphDB       │    │  ──────────────  │                   │
│  │  • Nodes/Edges   │    │  • Multi-core    │                   │
│  │  • Traversal     │    │  • Memory-mapped │                   │
│  └──────────────────┘    └──────────────────┘                   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Low-Level API
For more control, you can use the low-level API directly:
Basic Vector Database
from vectordb_optimized import VectorDB, Filter
import numpy as np
# Create database
db = VectorDB("./my_database")
collection = db.create_collection("documents", dimensions=384)
# Insert vectors
vector = np.random.randn(384).astype(np.float32)
collection.insert(vector, id="doc1", metadata={"category": "tech", "author": "Alice"})
# Batch insert (faster)
vectors = np.random.randn(1000, 384).astype(np.float32)
ids = [f"doc_{i}" for i in range(1000)]
metadata_list = [{"category": "tech"} for _ in range(1000)]
collection.insert_batch(vectors, ids, metadata_list)
# Search
query = np.random.randn(384).astype(np.float32)
results = collection.search(query, k=10)
for r in results:
    print(f"ID: {r.id}, Score: {r.score:.4f}")
# Filtered search
results = collection.search(
    query, k=10,
    filter=Filter.eq("category", "tech")
)
# Save to disk
db.save()
Memory Compression with Quantization
from quantization import ScalarQuantizer, BinaryQuantizer
import numpy as np
vectors = np.random.randn(100000, 384).astype(np.float32)
# Scalar Quantization: 4x compression, 97%+ recall
sq = ScalarQuantizer(dimensions=384)
sq.train(vectors)
quantized = sq.encode(vectors)
print(f"Original: {vectors.nbytes / 1e6:.1f} MB")
print(f"Quantized: {quantized.nbytes / 1e6:.1f} MB")
# Search with quantized vectors
query = np.random.randn(384).astype(np.float32)
distances = sq.distances_l2(query, quantized)
top_k = np.argpartition(distances, 10)[:10]
# Binary Quantization: 32x compression, ultra-fast hamming distance
bq = BinaryQuantizer(dimensions=384)
bq.train(vectors)
binary = bq.encode(vectors)
distances = bq.distances_hamming(query, binary)
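To illustrate where the 4x and 32x figures come from, here is a minimal pure-Python sketch of the two ideas: scalar quantization stores one byte per component instead of a 4-byte float32, and binary quantization keeps one sign bit per component. All function names are hypothetical, and the single global min/max range is an assumption; this is not the code inside `ScalarQuantizer` or `BinaryQuantizer`.

```python
import random


def train_min_max(vectors):
    """Learn one global value range from the training vectors (an assumption)."""
    flat = [x for vector in vectors for x in vector]
    return min(flat), max(flat)


def scalar_encode(vector, lo, hi):
    """Map each float in [lo, hi] to a one-byte code in 0..255."""
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return bytes(min(255, max(0, round((x - lo) * scale))) for x in vector)


def scalar_decode(codes, lo, hi):
    """Approximate the original floats from their one-byte codes."""
    step = (hi - lo) / 255.0
    return [lo + code * step for code in codes]


def binary_encode(vector):
    """Keep only the sign of each component, packed into one integer (1 bit each)."""
    bits = 0
    for x in vector:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions where two binary codes differ."""
    return bin(a ^ b).count("1")


random.seed(0)
vectors = [[random.gauss(0.0, 1.0) for _ in range(384)] for _ in range(100)]
lo, hi = train_min_max(vectors)

codes = scalar_encode(vectors[0], lo, hi)
restored = scalar_decode(codes, lo, hi)
max_error = max(abs(a - b) for a, b in zip(vectors[0], restored))

# Packed sizes per 384-dimensional vector: 1536 bytes, 384 bytes, 48 bytes.
print(f"bytes per vector: float32={384 * 4}, scalar={len(codes)}, binary={384 // 8}")
print(f"largest scalar round-trip error: {max_error:.4f}")
print("hamming(v0, v1):", hamming_distance(binary_encode(vectors[0]), binary_encode(vectors[1])))
```

The scalar round-trip error is bounded by half a quantization step, which is why recall stays high; binary codes discard magnitude entirely, trading recall for size and speed.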
Parallel Processing for Large Datasets
from parallel_search import ParallelSearchEngine, MemoryMappedVectors
import numpy as np
vectors = np.random.randn(1000000, 128).astype(np.float32)
query = np.random.randn(128).astype(np.float32)
# Parallel search with BLAS (67x faster than naive)
engine = ParallelSearchEngine(n_workers=8)
results = engine.search_parallel(query, vectors, k=10, metric="cosine")
# Batch search with GEMM (2x faster for multiple queries)
queries = np.random.randn(100, 128).astype(np.float32)
all_results = engine.search_batch_parallel(queries, vectors, k=10)
# Memory-mapped for datasets larger than RAM
mmap = MemoryMappedVectors("./large_dataset", dimensions=128)
mmap.create(n_vectors=100_000_000)
mmap.append_batch(vectors)
results = mmap.search_parallel(query, k=10, engine=engine)
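The general pattern behind a multi-worker search can be sketched as split, search each chunk, then merge the partial top-k lists. The pure-Python version below is only an illustration of that structure, with hypothetical function names. It is not `ParallelSearchEngine`'s implementation, and plain Python threads will not give the speedups quoted above; real engines typically do the per-chunk work in native code such as BLAS.

```python
import heapq
import math
import random
from concurrent.futures import ThreadPoolExecutor


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def top_k_in_chunk(query, vectors, start, stop, k):
    """Score rows start..stop-1 and keep the k closest as (distance, index) pairs."""
    scored = ((cosine_distance(query, vectors[i]), i) for i in range(start, stop))
    return heapq.nsmallest(k, scored)


def search_chunked(query, vectors, k=10, n_workers=4):
    """Split the rows into chunks, search each in a worker, merge the partial results."""
    n = len(vectors)
    size = max(1, math.ceil(n / n_workers))
    bounds = [(start, min(start + size, n)) for start in range(0, n, size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(
            lambda b: top_k_in_chunk(query, vectors, b[0], b[1], k), bounds
        )
        # The global top-k is always contained in the union of the per-chunk top-k lists.
        return heapq.nsmallest(k, (hit for part in partials for hit in part))


random.seed(1)
vectors = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
query = [random.gauss(0.0, 1.0) for _ in range(8)]
hits = search_chunked(query, vectors, k=5, n_workers=4)
for distance, index in hits:
    print(f"row {index}: distance {distance:.4f}")
```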
Knowledge Graph
from graph import GraphDB, NodeBuilder, EdgeBuilder
graph = GraphDB()
# Create nodes
graph.create_node(
    NodeBuilder("user_1")
    .label("User")
    .property("name", "Alice")
    .property("role", "engineer")
    .build()
)
graph.create_node(
    NodeBuilder("doc_1")
    .label("Document")
    .property("title", "Vector DB Guide")
    .build()
)
# Create relationships
graph.create_edge(
    EdgeBuilder("user_1", "doc_1", "AUTHORED")
    .property("date", "2024-01-15")
    .build()
)
# Query neighbors
neighbors = graph.neighbors("user_1", direction="out")
for node in neighbors:
    print(f"Connected to: {node.id}")
# Traverse graph
paths = graph.traverse("user_1", max_depth=3)
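As an illustration of what a depth-limited traversal means, the sketch below runs a breadth-first walk over a plain adjacency dict and records the depth at which each node is first reached. The `traverse` function and its return format are hypothetical and do not reflect `GraphDB.traverse`, whose return value (paths) may be shaped differently.

```python
from collections import deque


def traverse(adjacency, start, max_depth):
    """Breadth-first walk over outgoing edges; returns {node_id: depth} within max_depth."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depths[node] == max_depth:
            continue  # do not expand nodes already at the depth limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in depths:
                depths[neighbor] = depths[node] + 1
                queue.append(neighbor)
    return depths


# Made-up graph: user_2 is four hops from user_1, so max_depth=3 excludes it.
adjacency = {
    "user_1": ["doc_1", "doc_2"],
    "doc_1": ["topic_1"],
    "topic_1": ["doc_3"],
    "doc_3": ["user_2"],
}
reachable = traverse(adjacency, "user_1", max_depth=3)
print(reachable)
```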
Embeddings Integration
from embeddings import get_embedder, EmbeddingCollection
from vectordb_optimized import VectorDB
# Auto-detect embedder (uses OPENAI_API_KEY if set, else local)
embedder = get_embedder()
# Or specify provider
embedder = get_embedder("openai", model="text-embedding-3-small")
embedder = get_embedder("sentence-transformers", model="all-MiniLM-L6-v2")
# Create embedding-aware collection
db = VectorDB("./my_db")
collection = db.create_collection("docs", dimensions=embedder.dimensions)
docs = EmbeddingCollection(collection, embedder)
# Insert with automatic embedding
docs.add("doc1", "Machine learning is fascinating", {"category": "AI"})
# Search with text query
results = docs.search("artificial intelligence", k=5)
REST API Server
# Start server
uvicorn server:app --reload --port 8000
# Or run directly
python server.py
# Client usage
from client import VectorDBClient
client = VectorDBClient("http://localhost:8000")
# Create collection
client.create_collection("docs", dimensions=384)
# Insert
client.insert("docs", vector=[0.1, 0.2, ...], metadata={"title": "Hello"})
# Search
results = client.search("docs", vector=[0.1, 0.2, ...], k=10)
Module Reference
| Module | Description |
|---|---|
| fastpydb/ | High-level client API (recommended) |
| vectordb_optimized.py | Core vector database with HNSW indexing |
| quantization.py | Scalar, binary, and product quantization |
| parallel_search.py | Multi-core search engine and memory-mapped vectors |
| graph.py | Knowledge graph with nodes, edges, and traversal |
| hybrid_search.py | Combined vector + graph search |
| embeddings.py | OpenAI, Sentence Transformers, Cohere integration |
| server.py | FastAPI REST server |
| client.py | Python HTTP client |
| realtime.py | WebSocket real-time subscriptions |
Filter Operations
from fastpydb import Filter
Filter.eq("field", value) # Equal
Filter.ne("field", value) # Not equal
Filter.gt("field", value) # Greater than
Filter.gte("field", value) # Greater than or equal
Filter.lt("field", value) # Less than
Filter.lte("field", value) # Less than or equal
Filter.in_("field", [values]) # In list
Filter.contains("field", "sub") # Contains substring
Filter.regex("field", "^pat.*") # Regex match
Filter.and_([f1, f2]) # AND
Filter.or_([f1, f2]) # OR
Filter.not_(f1) # NOT
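One way to picture how such filters compose is as small predicate functions over a metadata dict, where `and_`, `or_`, and `not_` wrap other predicates. The sketch below is a hypothetical stand-alone model of that idea, assuming a missing field never matches; it is not how the `Filter` class is implemented.

```python
def eq(field, value):
    """Match when the field is present and equal to value."""
    return lambda md: field in md and md[field] == value


def gte(field, value):
    """Match when the field is present and greater than or equal to value."""
    return lambda md: field in md and md[field] >= value


def in_(field, values):
    """Match when the field is present and its value is one of values."""
    return lambda md: field in md and md[field] in values


def and_(filters):
    return lambda md: all(f(md) for f in filters)


def or_(filters):
    return lambda md: any(f(md) for f in filters)


def not_(f):
    return lambda md: not f(md)


is_match = and_([
    eq("category", "tech"),
    gte("score", 0.8),
    in_("status", ["published", "draft"]),
])

print(is_match({"category": "tech", "score": 0.9, "status": "draft"}))
print(is_match({"category": "tech", "score": 0.5, "status": "draft"}))
```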
Quantization Comparison
| Quantizer | Compression | Recall | Speed | Use Case |
|---|---|---|---|---|
| ScalarQuantizer | 4x | 97%+ | Moderate | Production (balanced) |
| BinaryQuantizer | 32x | ~85% | Very Fast | Ultra-fast filtering |
| ProductQuantizer | 8-16x | ~90% | Fast | Research/Analytics |
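The scalar and binary compression ratios follow directly from bits per component: float32 uses 32 bits, a scalar code 8 bits, and a binary code 1 bit. A small worked calculation for 100,000 vectors of 384 dimensions is sketched below; `storage_bytes` is a hypothetical helper, and the figures ignore any per-quantizer overhead such as stored ranges or codebooks.

```python
def storage_bytes(n_vectors, dimensions, bits_per_component):
    """Raw storage for the vector data alone, in bytes."""
    return n_vectors * dimensions * bits_per_component // 8


n_vectors, dimensions = 100_000, 384
float32_bytes = storage_bytes(n_vectors, dimensions, 32)
scalar_bytes = storage_bytes(n_vectors, dimensions, 8)
binary_bytes = storage_bytes(n_vectors, dimensions, 1)

print(f"float32: {float32_bytes / 1e6:.1f} MB")
print(f"scalar:  {scalar_bytes / 1e6:.1f} MB ({float32_bytes // scalar_bytes}x smaller)")
print(f"binary:  {binary_bytes / 1e6:.1f} MB ({float32_bytes // binary_bytes}x smaller)")
```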
Benchmarks
Performance on 100K vectors, 128 dimensions:
| Method | Latency | QPS | Memory |
|---|---|---|---|
| Naive Python | 450 ms | 2 | 48 MB |
| Vectorized BLAS | 6 ms | 167 | 48 MB |
| HNSW | 0.17 ms | 5,773 | 48 MB |
| Scalar Quantized | 6 ms | 167 | 12 MB |
| Binary Quantized | 0.8 ms | 1,250 | 1.5 MB |
Speedup vs Naive (100K vectors):
| Method | Speedup |
|---|---|
| BLAS | 89x |
| Parallel Engine | 87x |
| Batch GEMM | 267x |
| HNSW | 2,388x |
| Hybrid | 939x |
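The latency and QPS columns in the first benchmark table are related: assuming queries are issued one at a time, QPS is roughly the reciprocal of per-query latency. The sketch below applies that assumption to the latencies listed above. The table's QPS values are measurements, so they need not match this estimate exactly (the HNSW row lists 5,773 QPS against an estimate of about 5,882).

```python
def qps_from_latency_ms(latency_ms):
    """Estimated queries per second for sequential queries at the given latency."""
    return 1000.0 / latency_ms


# Latencies copied from the benchmark table above.
latencies_ms = {
    "Naive Python": 450,
    "Vectorized BLAS": 6,
    "HNSW": 0.17,
    "Binary Quantized": 0.8,
}
estimates = {name: qps_from_latency_ms(ms) for name, ms in latencies_ms.items()}
for name, qps in estimates.items():
    print(f"{name}: ~{qps:,.0f} QPS")
```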
Running Benchmarks
# Quick benchmark (10K vectors)
python examples/benchmark.py --quick
# Standard benchmark (100K vectors)
python examples/benchmark.py --medium
# Stress test (1M vectors)
python examples/benchmark.py --stress
# Parallel search benchmark
python examples/benchmark_parallel.py
# Quantization benchmark
python examples/benchmark_quantization.py
Performance Tuning
HNSW Parameters
| Parameter | Default | Description | Tradeoff |
|---|---|---|---|
| M | 16 | Connections per node | Higher = better recall, more memory |
| ef_construction | 200 | Build quality | Higher = better index, slower build |
| ef_search | 50 | Search quality | Higher = better recall, slower search |
collection = db.create_collection(
    "docs",
    dimensions=384,
    M=32,
    ef_construction=400,
)
collection.set_ef_search(100)
Memory vs Speed Guidelines
| Dataset Size | Recommendation |
|---|---|
| < 100K vectors | HNSW only |
| 100K - 1M | HNSW + Scalar quantization |
| 1M - 10M | Memory-mapped + HNSW |
| > 10M | Memory-mapped + Binary quantization + HNSW candidates |
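The guidelines above can be encoded as a small lookup, which may be handy in setup scripts. `recommend_setup` is a hypothetical helper, and how the exact boundary values (100K, 1M, 10M) are assigned is an assumption, since the table does not say which row they belong to.

```python
def recommend_setup(n_vectors):
    """Return the guideline row for a dataset size (boundary handling is an assumption)."""
    if n_vectors < 100_000:
        return "HNSW only"
    if n_vectors <= 1_000_000:
        return "HNSW + Scalar quantization"
    if n_vectors <= 10_000_000:
        return "Memory-mapped + HNSW"
    return "Memory-mapped + Binary quantization + HNSW candidates"


for size in (50_000, 500_000, 5_000_000, 50_000_000):
    print(f"{size:>11,} vectors -> {recommend_setup(size)}")
```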
Requirements
Core:
numpy>=1.24.0
hnswlib>=0.8.0
Optional:
sentence-transformers>=2.2.0 # Local embeddings
openai>=1.0.0 # OpenAI embeddings
fastapi>=0.109.0 # REST API
uvicorn>=0.27.0 # ASGI server
License
MIT License - see LICENSE for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request