Where vectors come alive - A lightweight, visual-first vector database with embedded ML models

VectrixDB

A lightweight vector database with embedded ML models, beautiful dashboard, and GraphRAG - no API keys required.


Features

  • 4 Search Modes - Dense, Hybrid, Ultimate, and Graph (GraphRAG)
  • Embedded Models - Works offline with bundled ONNX models
  • Model Selection - Choose from bundled, HuggingFace, or GitHub release models
  • Visual Dashboard - Built-in web UI for managing collections
  • Zero Config - Just pip install and start using

Installation

pip install vectrixdb

Quick Start

from vectrixdb import Vectrix

db = Vectrix("my_docs")
db.add(["Python is great", "JavaScript powers the web", "Rust is fast"])

results = db.search("programming")
print(results.top.text)

Search Modes

VectrixDB offers 4 search modes, each building on the previous:

Mode      Components                  Best For
dense     Vector similarity           Fast semantic search
hybrid    Dense + Sparse + Reranker   Keyword + semantic matching
ultimate  Hybrid + ColBERT            Maximum accuracy
graph     Ultimate + Knowledge Graph  Complex reasoning (GraphRAG)

# Choose your mode
db = Vectrix("docs", mode="dense")     # Fastest
db = Vectrix("docs", mode="hybrid")    # Balanced
db = Vectrix("docs", mode="ultimate")  # Best quality
db = Vectrix("docs", mode="graph")     # GraphRAG
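
To make the "each building on the previous" relationship concrete, here is a minimal sketch that expands a mode name into its full component list. It only restates the table above; MODE_DEFINITIONS and components_for are hypothetical names for illustration and are not part of the VectrixDB API.

```python
# Each mode is its parent mode plus extra components, as in the table above.
# These names are illustrative only; they are not VectrixDB identifiers.
MODE_DEFINITIONS = {
    "dense": (None, ["dense"]),
    "hybrid": ("dense", ["sparse", "reranker"]),
    "ultimate": ("hybrid", ["colbert"]),
    "graph": ("ultimate", ["knowledge_graph"]),
}


def components_for(mode):
    """Return the flat list of components a mode uses, parents first."""
    if mode not in MODE_DEFINITIONS:
        raise ValueError(f"unknown mode: {mode!r}")
    parent, extra = MODE_DEFINITIONS[mode]
    inherited = components_for(parent) if parent else []
    return inherited + extra


for mode in MODE_DEFINITIONS:
    print(mode, "->", " + ".join(components_for(mode)))
```

Since each step adds components, a later mode will generally cost more per query than the one before it, which is consistent with the "Fastest" to "Best quality" comments above.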

Model Selection

Customize models for each component. Models load from 3 sources:

1. Bundled Models (Offline)

Pre-packaged ONNX models that work without internet (~100MB total):

db = Vectrix(
    "docs",
    mode="ultimate",
    dense_model="e5-small",
    sparse_model="bm25",
    reranker_model="L12",
    late_interaction_model="colbert",
)

Component  Alias     Model                          Size
Dense      e5-small  intfloat/e5-small-v2           33MB
Sparse     bm25      BM25 vocabulary                1MB
Reranker   L12       ms-marco-MiniLM-L12-v2         33MB
ColBERT    colbert   answerai-colbert-small-v1      33MB

2. HuggingFace Models

Use any compatible model from HuggingFace (downloads on first use):

db = Vectrix(
    "docs",
    mode="hybrid",
    dense_model="BAAI/bge-large-en-v1.5",
    sparse_model="naver/splade-cocondenser-ensembledistil",
    reranker_model="cross-encoder/ms-marco-MiniLM-L-12-v2",
)

Compatible models:

  • Dense: BAAI/bge-large-en-v1.5, intfloat/e5-large-v2, sentence-transformers/all-mpnet-base-v2
  • Sparse: naver/splade-cocondenser-ensembledistil
  • Reranker: cross-encoder/ms-marco-MiniLM-L-12-v2, BAAI/bge-reranker-base
  • ColBERT: jinaai/jina-colbert-v2, colbert-ir/colbertv2.0

3. GitHub Release Models

Larger models hosted on GitHub releases (auto-downloaded on first use):

db = Vectrix(
    "docs",
    mode="ultimate",
    dense_model="github:bge-small",
    sparse_model="github:splade",
    reranker_model="github:reranker-l6",
    late_interaction_model="github:bge-m3",
)

Tag                    Model                      Type      Languages  Size
github:bge-small       BAAI/bge-small-en-v1.5     Dense     EN         127MB
github:e5-small        intfloat/e5-small-v2 FP32  Dense     EN         127MB
github:dense-multi     multilingual-e5-small      Dense     100+       113MB
github:splade          SPLADE++                   Sparse    EN         508MB
github:reranker-l6     ms-marco-MiniLM-L6-v2      Reranker  EN         87MB
github:reranker-multi  mMiniLMv2-L12              Reranker  15+        113MB
github:bge-m3          BGE-M3                     ColBERT   100+       563MB
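
The three sources can be told apart by the shape of the model name in the examples above: a github: prefix, a HuggingFace-style org/name ID, or a short bundled alias. The sketch below encodes that reading and uses the sizes from the table to estimate the first-use download for a GitHub-hosted configuration. It is an assumption based on this page, not VectrixDB's actual resolver; model_source and github_download_mb are hypothetical helpers.

```python
# Sizes in MB, copied from the GitHub release table above.
GITHUB_SIZES_MB = {
    "github:bge-small": 127,
    "github:e5-small": 127,
    "github:dense-multi": 113,
    "github:splade": 508,
    "github:reranker-l6": 87,
    "github:reranker-multi": 113,
    "github:bge-m3": 563,
}


def model_source(name):
    """Guess where a model name would load from, based on its shape.

    This mirrors the naming used in the examples; the library's real
    resolution rules may differ.
    """
    if name.startswith("github:"):
        return "github"
    if "/" in name:
        return "huggingface"
    return "bundled"


def github_download_mb(**models):
    """Sum the listed sizes of all github: models in a configuration."""
    return sum(
        GITHUB_SIZES_MB[name]
        for name in models.values()
        if model_source(name) == "github"
    )


total = github_download_mb(
    dense_model="github:bge-small",
    sparse_model="github:splade",
    reranker_model="github:reranker-l6",
    late_interaction_model="github:bge-m3",
)
print(f"First-use download: about {total} MB")
```

For the ultimate-mode configuration shown above, the listed sizes add up to 1285 MB, so expect a sizeable first-use download compared with the roughly 100MB bundled set.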

Metadata & Filtering

db.add(
    texts=["iPhone 15", "Galaxy S24", "Pixel 8"],
    metadata=[
        {"brand": "Apple", "price": 999},
        {"brand": "Samsung", "price": 899},
        {"brand": "Google", "price": 699}
    ]
)

results = db.search("smartphone", filter={"brand": "Apple"})
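
The filter in this example reads as an exact match on metadata keys. The sketch below shows that behaviour in plain Python, assuming every key in the filter must equal the stored value. This page does not say whether VectrixDB also supports range or other operators, and matches_filter is a hypothetical helper, not a library function.

```python
# The same three documents as above, as plain dictionaries.
documents = [
    {"text": "iPhone 15", "metadata": {"brand": "Apple", "price": 999}},
    {"text": "Galaxy S24", "metadata": {"brand": "Samsung", "price": 899}},
    {"text": "Pixel 8", "metadata": {"brand": "Google", "price": 699}},
]


def matches_filter(metadata, flt):
    """True when every filter key is present and equal in the metadata."""
    return all(key in metadata and metadata[key] == value
               for key, value in flt.items())


# Assumed semantics: only documents passing the filter are search candidates.
candidates = [doc["text"] for doc in documents
              if matches_filter(doc["metadata"], {"brand": "Apple"})]
print(candidates)
```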

Storage Backends

Use external storage backends (Lakebase, DeltaLake, CosmosDB) with full search mode support:

from vectrixdb import Vectrix, VectrixDB

# Connect to Lakebase (PostgreSQL + pgvector)
lakebase = VectrixDB.with_lakebase(
    host="your-lakebase-host",
    database="vectrixdb",
    user="your-user",
    password="your-password",
)

# Use Vectrix with storage backend + ultimate mode
db = Vectrix(
    "products",
    mode="ultimate",
    dense_model="bge-small",
    sparse_model="splade",
    reranker_model="L6",
    late_interaction_model="colbert",
    storage_backend=lakebase,
)

db.add(texts=["Product A", "Product B"])
results = db.search("query")  # Full ultimate search from Lakebase

Adaptive Schema

Schema adapts based on selected mode:

Mode      Columns Created
dense     dense_embedding
hybrid    dense_embedding + sparse_embedding
ultimate  dense_embedding + sparse_embedding + late_interaction_embedding
graph     Same as ultimate + graph tables

All modes also store text_content, which the reranker reads; reranker scores are computed at query time rather than stored.
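
As a quick reference, the mapping above can be written as a small lookup. This is a sketch of the table only: the column names come from this page, while schema_for and the returned dictionary layout are hypothetical and do not reflect how VectrixDB builds its tables internally.

```python
# Embedding columns per mode, taken from the table above.
EMBEDDING_COLUMNS = {
    "dense": ["dense_embedding"],
    "hybrid": ["dense_embedding", "sparse_embedding"],
    "ultimate": ["dense_embedding", "sparse_embedding",
                 "late_interaction_embedding"],
}
# graph uses the same columns as ultimate, plus separate graph tables.
EMBEDDING_COLUMNS["graph"] = list(EMBEDDING_COLUMNS["ultimate"])


def schema_for(mode):
    """Describe the storage layout for a mode (illustrative structure)."""
    return {
        # text_content is stored in every mode.
        "columns": ["text_content"] + EMBEDDING_COLUMNS[mode],
        "graph_tables": mode == "graph",
    }


print(schema_for("hybrid"))
```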

REST API

Start the server:

VECTRIXDB_API_KEY=your_secret vectrixdb serve --port 7337

Open the dashboard at http://localhost:7337/dashboard

API Examples

# Create collection
curl -X POST http://localhost:7337/api/v1/collections \
  -H "Content-Type: application/json" \
  -H "api-key: your_secret" \
  -d '{"name": "docs", "dimension": 384}'

# Add documents (auto-embedding)
curl -X POST http://localhost:7337/api/v1/collections/docs/text-upsert \
  -H "Content-Type: application/json" \
  -H "api-key: your_secret" \
  -d '{"points": [{"id": "1", "text": "Hello world"}]}'

# Search
curl -X POST http://localhost:7337/api/v1/collections/docs/text-search \
  -H "Content-Type: application/json" \
  -H "api-key: your_secret" \
  -d '{"query_text": "greeting", "limit": 10}'
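
The same three calls can be made from Python with only the standard library. The sketch below builds the requests without sending them, reusing the endpoints, payloads, and api-key header from the curl examples. build_request is a hypothetical helper, and the response format is not documented on this page, so it is left unparsed.

```python
import json
import urllib.request

# Matches the server started with: vectrixdb serve --port 7337
BASE_URL = "http://localhost:7337/api/v1"


def build_request(path, payload, api_key, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request with the api-key header."""
    return urllib.request.Request(
        url=f"{base_url}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


create = build_request("/collections",
                       {"name": "docs", "dimension": 384}, "your_secret")
upsert = build_request("/collections/docs/text-upsert",
                       {"points": [{"id": "1", "text": "Hello world"}]},
                       "your_secret")
search = build_request("/collections/docs/text-search",
                       {"query_text": "greeting", "limit": 10}, "your_secret")

# To send a request (requires a running server):
# with urllib.request.urlopen(search) as response:
#     print(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the example runnable without a server and makes the URL, headers, and body easy to inspect.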

Project Structure

VectrixDB/
├── vectrixdb/
│   ├── core/           # Vector index, storage, search
│   │   ├── graphrag/   # Knowledge graph
│   │   └── search/     # Search algorithms
│   ├── api/            # FastAPI server
│   ├── models/         # Embedded ONNX models
│   ├── dashboard/      # Web UI
│   └── cli.py          # Command line
├── tests/
└── requirements.txt

Install from Source

git clone https://github.com/knowusuboaky/VectrixDB.git
cd VectrixDB
pip install -e .

Requirements

  • Python 3.9+
  • No API keys needed
  • Models are bundled or auto-downloaded

License

Apache 2.0

Author

Kwadwo Daddy Nyame Owusu-Boakye

GitHub: @knowusuboaky
