Universal vector search wrapper for Postgres, MySQL, MariaDB, SQLite, DuckDB, ClickHouse, Redis, RethinkDB (pgvector, HeatWave, sqlite-vss, DuckDB VSS, ClickHouse ANN, RediSearch, Real-time changefeeds)
vectorwrap 0.8.0
Universal vector search wrapper for Postgres, MySQL, SQLite, DuckDB, ClickHouse (pgvector, HeatWave, sqlite-vss, DuckDB VSS, ClickHouse ANN).
Switch between PostgreSQL, MySQL, SQLite, DuckDB, and ClickHouse vector backends with a single line of code. Perfect for prototyping, testing, and production deployments.
Stable API - Core methods follow semantic versioning with backward compatibility guarantees.
Quick Start
# Core install (PostgreSQL + MySQL support)
pip install vectorwrap
# Add SQLite support (requires system SQLite with extension support)
pip install "vectorwrap[sqlite]"
# Add DuckDB support (includes VSS extension)
pip install "vectorwrap[duckdb]"
# Add ClickHouse support (includes clickhouse-connect)
pip install "vectorwrap[clickhouse]"
# Install all backends for development
pip install "vectorwrap[sqlite,duckdb,clickhouse]"
from vectorwrap import VectorDB
# Your embedding function (use OpenAI, Hugging Face, etc.)
def embed(text: str) -> list[float]:
    # Return your 1536-dim embeddings here (placeholder values shown)
    return [0.1] * 1536
# Connect to any supported database
db = VectorDB("postgresql://user:pass@host/db") # or mysql://... or sqlite:///path.db or duckdb:///path.db or clickhouse://...
db.create_collection("products", dim=1536)
# Insert vectors with metadata
db.upsert("products", 1, embed("Apple iPhone 15 Pro"), {"category": "phone", "price": 999})
db.upsert("products", 2, embed("Samsung Galaxy S24"), {"category": "phone", "price": 899})
# Semantic search with filtering
results = db.query(
    collection="products",
    query_vector=embed("latest smartphone"),
    top_k=5,
    filter={"category": "phone"},
)
print(results) # → [(1, 0.023), (2, 0.087)]
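For local experiments without an embedding provider, a deterministic stand-in can replace `embed`. The sketch below is an assumption, not part of vectorwrap: `fake_embed` is a hypothetical helper that hashes the text into a unit-length vector. The vectors carry no semantic meaning, so it is only useful for exercising the insert and query plumbing.

```python
import hashlib
import math


def fake_embed(text: str, dim: int = 1536) -> list[float]:
    """Deterministic stand-in for a real embedding model (hypothetical helper)."""
    values: list[float] = []
    counter = 0
    # Stretch the hash by re-hashing with a counter until we have `dim` values.
    while len(values) < dim:
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        # Map each byte (0..255) to a float in [-1.0, 1.0].
        values.extend(b / 127.5 - 1.0 for b in digest)
        counter += 1
    values = values[:dim]
    # Normalize to unit length so distances stay in a predictable range.
    norm = math.sqrt(sum(v * v for v in values)) or 1.0
    return [v / norm for v in values]


vector = fake_embed("Apple iPhone 15 Pro")
print(len(vector))
```

The same text always maps to the same vector, which keeps test runs reproducible.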
Supported Backends
| Database | Vector Type | Indexing | Installation | Notes |
|---|---|---|---|---|
| PostgreSQL 16+ + pgvector | VECTOR(n) | HNSW | CREATE EXTENSION vector; | Production ready |
| MySQL 8.2+ HeatWave | VECTOR(n) | Automatic | Built-in | Native vector support |
| MySQL ≤8.0 (legacy) | JSON arrays | None | Built-in | Slower, Python distance |
| MariaDB 11.8+ GA LTS | VECTOR(n) | HNSW | Built-in | Native vectors, 10M+ users |
| MariaDB <11.8 (legacy) | JSON arrays | None | Built-in | Auto-fallback, Python distance |
| SQLite + sqlite-vss | Virtual table | HNSW | pip install "vectorwrap[sqlite]" | Great for prototyping |
| DuckDB + VSS | FLOAT[] arrays | HNSW | pip install "vectorwrap[duckdb]" | Analytics + vectors |
| ClickHouse | Array(Float32) | HNSW | pip install "vectorwrap[clickhouse]" | High-performance analytics |
| Redis + RediSearch | Binary vectors | HNSW/FLAT | pip install "vectorwrap[redis]" | Ultra-fast in-memory search |
| RethinkDB | JSON arrays | In-memory HNSW | pip install "vectorwrap[rethinkdb]" | WORLD'S FIRST: Real-time changefeeds |
Examples
Complete Example with OpenAI Embeddings
from openai import OpenAI
from vectorwrap import VectorDB
client = OpenAI()
def embed(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding
# Use any database - just change the connection string!
db = VectorDB("postgresql://user:pass@localhost/vectors")
db.create_collection("documents", dim=1536)
# Add some documents
documents = [
    ("Python is a programming language", {"topic": "programming"}),
    ("Machine learning uses neural networks", {"topic": "ai"}),
    ("Databases store structured data", {"topic": "data"}),
]
for i, (text, metadata) in enumerate(documents):
    db.upsert("documents", i, embed(text), metadata)
# Search for similar content
query = "What is artificial intelligence?"
results = db.query("documents", embed(query), top_k=2)
for doc_id, distance in results:
    print(f"Document {doc_id}: distance={distance:.3f}")
Database-Specific Connection Strings
# PostgreSQL with pgvector
db = VectorDB("postgresql://user:password@localhost:5432/mydb")
# MySQL (8.2+ with native vectors or legacy JSON mode)
db = VectorDB("mysql://user:password@localhost:3306/mydb")
# SQLite (local file or in-memory)
db = VectorDB("sqlite:///./vectors.db")
db = VectorDB("sqlite:///:memory:")
# DuckDB (local file or in-memory)
db = VectorDB("duckdb:///./vectors.db")
db = VectorDB("duckdb:///:memory:")
# ClickHouse (local or remote)
db = VectorDB("clickhouse://default@localhost:8123/default")
db = VectorDB("clickhouse://user:password@host:port/database")
API Reference
VectorDB(connection_string: str) - Stable
Create a vector database connection.
create_collection(name: str, dim: int) - Stable
Create a new collection for vectors of dimension dim.
upsert(collection: str, id: int, vector: list[float], metadata: dict = None) - Stable
Insert or update a vector with optional metadata.
query(collection: str, query_vector: list[float], top_k: int = 5, filter: dict = None) - Stable
Find the top_k most similar vectors. Returns list of (id, distance) tuples.
Filtering Support:
- PostgreSQL & MySQL: Native SQL filtering
- SQLite: Adaptive oversampling (fetches more results, then filters)
- DuckDB: Native JSON filtering with SQL predicates
- ClickHouse: Native JSON filtering with JSONExtract functions
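The adaptive oversampling described for SQLite can be sketched as follows. This illustrates the general idea and is not vectorwrap's actual implementation: `search` is a hypothetical callable returning the `n` nearest unfiltered hits, `metadata` is a plain id-to-dict lookup, and the doubling strategy and `max_fetch` cap are assumptions.

```python
from typing import Callable

Hit = tuple[int, float]  # (id, distance)


def filtered_top_k(
    search: Callable[[int], list[Hit]],
    metadata: dict[int, dict],
    flt: dict,
    top_k: int,
    max_fetch: int = 1000,
) -> list[Hit]:
    """Fetch more candidates than needed, filter in Python, widen if short."""
    fetch = top_k * 2
    while True:
        hits = search(fetch)
        matches = [
            (doc_id, dist)
            for doc_id, dist in hits
            if all(metadata.get(doc_id, {}).get(k) == v for k, v in flt.items())
        ]
        exhausted = len(hits) < fetch  # the index had nothing more to return
        if len(matches) >= top_k or exhausted or fetch >= max_fetch:
            return matches[:top_k]
        fetch = min(fetch * 2, max_fetch)


# Toy data: every fourth item is a phone, distances grow with the id.
metadata = {i: {"category": "phone" if i % 4 == 0 else "laptop"} for i in range(40)}
all_hits = [(i, i * 0.01) for i in range(40)]


def search(n: int) -> list[Hit]:
    return all_hits[:n]


print(filtered_top_k(search, metadata, {"category": "phone"}, top_k=3))
```

The trade-off is extra fetches for selective filters, which is why backends with native SQL or JSON predicates avoid this path.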
API Stability
vectorwrap follows semantic versioning and maintains API stability:
Stable APIs (No breaking changes in minor versions)
- Core Interface: VectorDB() constructor and connection string format
- Collection Management: create_collection(name, dim)
- Data Operations: upsert(collection, id, vector, metadata) and query(collection, query_vector, top_k, filter)
- Return Formats: Query results as [(id, distance), ...] tuples
Evolving APIs (May change in minor versions with deprecation warnings)
- Backend-specific optimizations: Index configuration, distance metrics
- Advanced filtering: Complex filter syntax beyond simple key-value pairs
- Batch operations: Bulk insert/update methods (planned)
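Until bulk methods land, a batch can be written as a loop over the stable `upsert` call. The helper below is a hypothetical sketch, not a vectorwrap API: `upsert_many` accepts any object with an `upsert(collection, id, vector, metadata)` method and issues one call per row, so it adds convenience rather than speed.

```python
from typing import Iterable, Optional


def upsert_many(
    db,
    collection: str,
    rows: Iterable[tuple[int, list[float], Optional[dict]]],
) -> int:
    """Upsert each (id, vector, metadata) row and return how many were written."""
    count = 0
    for row_id, vector, metadata in rows:
        # One round trip per row; a native bulk method could do better.
        db.upsert(collection, row_id, vector, metadata)
        count += 1
    return count
```

Usage would look like `upsert_many(db, "products", [(1, vec1, {"category": "phone"}), (2, vec2, None)])`, where `db` is a connected `VectorDB`.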
Experimental (May change without notice)
- New backends: Recently added database support may have API refinements
- Extension methods: Database-specific functionality not in core API
Version Compatibility Promise
- Patch versions (0.3.1 → 0.3.2): Only bug fixes, no API changes
- Minor versions (0.3.x → 0.4.0): New features, deprecated APIs get warnings
- Major versions (0.x → 1.0): Breaking changes allowed, migration guide provided
Current Status: v0.8.0 - Stable release with API backward compatibility guarantees
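The promise above maps onto a simple comparison of version components. The sketch below shows how an upgrade script might classify a bump before deciding whether to read a migration guide; `classify_bump` is a hypothetical helper and assumes plain numeric versions such as `0.3.1`.

```python
def classify_bump(old: str, new: str) -> str:
    """Return "major", "minor", or "patch" for a numeric version change."""

    def parse(version: str) -> list[int]:
        parts = [int(p) for p in version.split(".")]
        # Pad short versions such as "1.0" to three components.
        return parts + [0] * (3 - len(parts))

    old_parts, new_parts = parse(old), parse(new)
    if new_parts[0] != old_parts[0]:
        return "major"  # breaking changes allowed
    if new_parts[1] != old_parts[1]:
        return "minor"  # new features, deprecation warnings
    return "patch"      # bug fixes only


print(classify_bump("0.3.1", "0.3.2"))
print(classify_bump("0.3.2", "0.4.0"))
print(classify_bump("0.8.0", "1.0"))
```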
Installation Notes
SQLite Setup
SQLite support requires loadable extensions. On some systems you may need:
# macOS with Homebrew
brew install sqlite
export LDFLAGS="-L$(brew --prefix sqlite)/lib"
export CPPFLAGS="-I$(brew --prefix sqlite)/include"
pip install "vectorwrap[sqlite]"
# Or use system package manager
# Ubuntu: apt install libsqlite3-dev
# CentOS: yum install sqlite-devel
PostgreSQL Setup
-- Enable pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;
MySQL Setup
MySQL 8.2+ has native VECTOR type support. For older versions, vectorwrap automatically falls back to JSON storage with Python-based distance calculations.
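The JSON fallback amounts to a brute-force scan: load every stored vector, compute its distance to the query in Python, and keep the closest. The sketch below illustrates that idea only. The Euclidean metric, the `(id, json_text)` row shape, and the function names are assumptions, not vectorwrap internals.

```python
import json
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_top_k(
    rows: list[tuple[int, str]], query: list[float], top_k: int = 5
) -> list[tuple[int, float]]:
    """Score every (id, json_text) row against the query and keep the nearest."""
    scored = [(row_id, l2_distance(json.loads(raw), query)) for row_id, raw in rows]
    scored.sort(key=lambda hit: hit[1])
    return scored[:top_k]


# Rows as they might come back from a JSON column.
rows = [(1, "[0.0, 0.0]"), (2, "[3.0, 4.0]"), (3, "[1.0, 0.0]")]
print(brute_force_top_k(rows, [0.0, 0.0], top_k=2))  # → [(1, 0.0), (3, 1.0)]
```

Every query touches every row, which is why the legacy mode is slower than a native VECTOR column with an index.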
MariaDB Setup
MariaDB 11.8 GA LTS introduced native VECTOR data type with HNSW indexing, similar to pgvector. For older versions, vectorwrap automatically falls back to JSON storage.
# MariaDB 11.8+ (native VECTOR support)
db = VectorDB("mariadb://user:pass@localhost:3306/vectordb")
db.create_collection("embeddings", dim=1536) # Uses VECTOR(1536) with HNSW
# Older versions automatically use JSON fallback
# No code changes needed - version detection is automatic
DuckDB Setup
The VSS extension has been available for DuckDB since v0.10.2 and provides HNSW indexing for fast vector similarity search:
# Works out of the box with vectorwrap[duckdb]
db = VectorDB("duckdb:///analytics.db")
db.create_collection("embeddings", dim=1536) # Auto-creates HNSW index
ClickHouse Setup
ClickHouse provides native support for vector similarity search using ANN indexes:
# Works with vectorwrap[clickhouse]
db = VectorDB("clickhouse://default@localhost:8123/default")
db.create_collection("embeddings", dim=1536) # Auto-creates HNSW index
Note: ClickHouse vector similarity indexes require ClickHouse version 25.8+ with the experimental feature enabled. The backend automatically handles this configuration.
Use Cases
- Prototyping: Start with SQLite or DuckDB, scale to PostgreSQL or ClickHouse
- Testing: Use in-memory databases (SQLite/DuckDB) for fast tests
- Analytics: DuckDB or ClickHouse for combining vector search with analytical queries
- Multi-tenant: Different customers on different database backends
- Migration: Move vector data between database systems seamlessly
- Hybrid deployments: PostgreSQL for production, DuckDB/ClickHouse for analytics
- High-performance: ClickHouse for large-scale vector search workloads
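Several of these use cases come down to choosing the connection string at runtime. One way to do that, sketched under assumptions: the environment variable name `VECTOR_DB_URL` and the helper `resolve_db_url` are hypothetical, and the in-memory SQLite default is just a convenient choice for tests.

```python
import os
from typing import Mapping, Optional

DEFAULT_URL = "sqlite:///:memory:"  # fast, disposable default for tests


def resolve_db_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the backend URL from the environment, falling back to in-memory SQLite."""
    env = os.environ if env is None else env
    return env.get("VECTOR_DB_URL", DEFAULT_URL)


# In application code the result would be passed straight to the constructor:
#   db = VectorDB(resolve_db_url())
print(resolve_db_url({}))
print(resolve_db_url({"VECTOR_DB_URL": "postgresql://user:pass@host/db"}))
```

Passing the mapping explicitly keeps the function easy to test without touching the real environment.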
Integrations
vectorwrap integrates with popular AI frameworks and platforms:
- Appwrite: Add AI/vector capabilities to Appwrite apps (uses MariaDB backend) - Testing Guide
- LangChain: Drop-in VectorStore adapter for RAG pipelines
- LlamaIndex: VectorStore wrapper for data frameworks
- Supabase: Managed PostgreSQL + pgvector helper
- Milvus: Enterprise vector database adapter
- Qdrant: Cloud-native vector search integration
# Install with integrations
pip install "vectorwrap[langchain]"
pip install "vectorwrap[llamaindex]"
pip install "vectorwrap[milvus]"
pip install "vectorwrap[qdrant]"
Example with Appwrite (No External Vector DB Needed):
from vectorwrap.integrations.appwrite import AppwriteVectorStore
# Add vector search to your Appwrite app
vector_store = AppwriteVectorStore.from_connection_string(
    connection_url="mariadb://appwrite:password@localhost:3306/appwrite",
    collection_name="embeddings",
    dimension=1536,
)
# Store and search vectors in Appwrite's MariaDB
vector_store.add_documents([
    {"text": "Hello world", "metadata": {"source": "doc1"}}
], embedding_function=embed_fn)
results = vector_store.search("greeting", embed_fn, top_k=5)
Example with LangChain:
# OpenAIEmbeddings now lives in the langchain-openai package
from langchain_openai import OpenAIEmbeddings
from vectorwrap.integrations.langchain import VectorwrapStore
embeddings = OpenAIEmbeddings()
vectorstore = VectorwrapStore(
    connection_url="postgresql://user:pass@localhost/db",
    collection_name="documents",
    embedding_function=embeddings,
)
vectorstore.add_texts(["Hello world", "LangChain + vectorwrap"])
results = vectorstore.similarity_search("greeting", k=5)
See docs/INTEGRATIONS.md for complete integration guide.
Benchmarks
Comprehensive performance benchmarks are available in the bench/ directory.
Quick benchmark:
pip install "vectorwrap[all]" matplotlib
python bench/benchmark.py
python bench/visualize.py benchmark_results.json
See bench/README.md for detailed benchmarking guide.
Roadmap
v1.0 Stable Release
- API Freeze: Lock stable APIs with full backward compatibility
- Production Testing: Comprehensive benchmarks across all backends [DONE]
- Documentation: Complete API docs and migration guides
Future Features
- Redis with RediSearch
- Elasticsearch with dense vector fields
- Qdrant and Weaviate support
- Batch operations for bulk inserts
- Index configuration options
- Distance metrics: Cosine, dot product, custom functions
License
MIT © 2025 Mihir Ahuja
If vectorwrap saved you time, please star the repo – it helps others discover it!