Where vectors come alive - A lightweight, visual-first vector database with embedded ML models

VectrixDB

A lightweight vector database with embedded ML models, a built-in dashboard, and GraphRAG. No API keys required.


Features

  • 4 Search Modes - Dense, Hybrid, Ultimate, and Graph (GraphRAG)
  • Embedded Models - Works offline with bundled ONNX models
  • Model Selection - Choose from bundled, HuggingFace, or GitHub release models
  • Visual Dashboard - Built-in web UI for managing collections
  • Zero Config - Just pip install and start using

Installation

pip install vectrixdb

Quick Start

from vectrixdb import Vectrix

db = Vectrix("my_docs")
db.add(["Python is great", "JavaScript powers the web", "Rust is fast"])

results = db.search("programming")
print(results.top.text)

Search Modes

VectrixDB offers four search modes, each building on the previous one:

Mode      Components                  Best For
--------  --------------------------  ----------------------------
dense     Vector similarity           Fast semantic search
hybrid    Dense + Sparse + Reranker   Keyword + semantic matching
ultimate  Hybrid + ColBERT            Maximum accuracy
graph     Ultimate + Knowledge Graph  Complex reasoning (GraphRAG)

# Choose your mode
db = Vectrix("docs", mode="dense")     # Fastest
db = Vectrix("docs", mode="hybrid")    # Balanced
db = Vectrix("docs", mode="ultimate")  # Best quality
db = Vectrix("docs", mode="graph")     # GraphRAG
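
To make the layering concrete, here is a minimal sketch of how a dense ranking and a sparse ranking might be merged in a hybrid-style search. It uses reciprocal rank fusion (RRF), a common technique picked here purely for illustration; this README does not say which fusion method VectrixDB uses, and the function name and toy rankings are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is the constant commonly used with RRF.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a dense retriever and a sparse retriever.
dense_ranking = ["doc_a", "doc_b", "doc_c"]
sparse_ranking = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)
```

Documents that rank well in both lists rise to the top, which is the intuition behind combining keyword and semantic matching before a reranker sees the candidates.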

Model Selection

Each component's model can be customized. Models load from three sources:

1. Bundled Models (Offline)

Pre-packaged ONNX models that work without internet (~100MB total):

db = Vectrix(
    "docs",
    mode="ultimate",
    dense_model="e5-small",
    sparse_model="bm25",
    reranker_model="L12",
    late_interaction_model="colbert",
)

Component  Alias     Model                      Size
---------  --------  -------------------------  ----
Dense      e5-small  intfloat/e5-small-v2       33MB
Sparse     bm25      BM25 vocabulary            1MB
Reranker   L12       ms-marco-MiniLM-L12-v2     33MB
ColBERT    colbert   answerai-colbert-small-v1  33MB

2. HuggingFace Models

Use any compatible model from HuggingFace (downloads on first use):

db = Vectrix(
    "docs",
    mode="hybrid",
    dense_model="BAAI/bge-large-en-v1.5",
    sparse_model="naver/splade-cocondenser-ensembledistil",
    reranker_model="cross-encoder/ms-marco-MiniLM-L-12-v2",
)

Compatible models:

  • Dense: BAAI/bge-large-en-v1.5, intfloat/e5-large-v2, sentence-transformers/all-mpnet-base-v2
  • Sparse: naver/splade-cocondenser-ensembledistil
  • Reranker: cross-encoder/ms-marco-MiniLM-L-12-v2, BAAI/bge-reranker-base
  • ColBERT: jinaai/jina-colbert-v2, colbert-ir/colbertv2.0

3. GitHub Release Models

Larger models hosted on GitHub releases (auto-downloaded on first use):

db = Vectrix(
    "docs",
    mode="ultimate",
    dense_model="github:bge-small",
    sparse_model="github:splade",
    reranker_model="github:reranker-l6",
    late_interaction_model="github:bge-m3",
)

Tag                    Model                      Type      Languages  Size
---------------------  -------------------------  --------  ---------  -----
github:bge-small       BAAI/bge-small-en-v1.5     Dense     EN         127MB
github:e5-small        intfloat/e5-small-v2 FP32  Dense     EN         127MB
github:dense-multi     multilingual-e5-small      Dense     100+       113MB
github:splade          SPLADE++                   Sparse    EN         508MB
github:reranker-l6     ms-marco-MiniLM-L6-v2      Reranker  EN         87MB
github:reranker-multi  mMiniLMv2-L12              Reranker  15+        113MB
github:bge-m3          BGE-M3                     ColBERT   100+       563MB
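
The three sources can be told apart by the shape of the model string in the examples above: a `github:` prefix, an `org/name` HuggingFace id, or a short bundled alias. The sketch below classifies a string on that basis. It only illustrates the naming pattern; `model_source` is a hypothetical helper, and VectrixDB's actual resolution logic is not shown in this README.

```python
def model_source(name: str) -> str:
    """Classify a model string by the naming patterns used in this README."""
    if name.startswith("github:"):
        return "github"        # e.g. "github:bge-small"
    if "/" in name:
        return "huggingface"   # e.g. "BAAI/bge-large-en-v1.5"
    return "bundled"           # e.g. "e5-small", "bm25", "L12", "colbert"


for name in ["e5-small", "BAAI/bge-large-en-v1.5", "github:bge-m3"]:
    print(name, "->", model_source(name))
```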

Metadata & Filtering

db.add(
    texts=["iPhone 15", "Galaxy S24", "Pixel 8"],
    metadata=[
        {"brand": "Apple", "price": 999},
        {"brand": "Samsung", "price": 899},
        {"brand": "Google", "price": 699}
    ]
)

results = db.search("smartphone", filter={"brand": "Apple"})
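
The filter above reads as an exact match on a metadata field. As a rough mental model, the following sketch applies that interpretation in plain Python. The `matches` helper is hypothetical, and it assumes equality on every key in the filter; whether VectrixDB also supports ranges or other operators is not stated here.

```python
def matches(metadata: dict, flt: dict) -> bool:
    """Return True when every key in the filter equals the stored value."""
    return all(key in metadata and metadata[key] == value
               for key, value in flt.items())


items = [
    ("iPhone 15", {"brand": "Apple", "price": 999}),
    ("Galaxy S24", {"brand": "Samsung", "price": 899}),
    ("Pixel 8", {"brand": "Google", "price": 699}),
]

# Keep only the documents whose metadata satisfies the filter.
apple_only = [text for text, meta in items if matches(meta, {"brand": "Apple"})]
print(apple_only)
```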

Storage Backends

Use external storage backends (Lakebase, DeltaLake, CosmosDB) with full search mode support:

from vectrixdb import Vectrix, VectrixDB

# Connect to Lakebase (PostgreSQL + pgvector)
lakebase = VectrixDB.with_lakebase(
    host="your-lakebase-host",
    database="vectrixdb",
    user="your-user",
    password="your-password",
)

# Use Vectrix with storage backend + ultimate mode
db = Vectrix(
    "products",
    mode="ultimate",
    dense_model="github:bge-small",
    sparse_model="github:splade",
    reranker_model="github:reranker-l6",
    late_interaction_model="colbert",
    storage_backend=lakebase,
)

db.add(texts=["Product A", "Product B"])
results = db.search("query")  # Full ultimate search from Lakebase
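
In practice you may prefer not to hardcode credentials. The sketch below gathers the four connection parameters from environment variables, failing early if any are missing. The variable names (`LAKEBASE_HOST` and so on) and the `lakebase_settings` helper are assumptions made for this example, not something VectrixDB defines.

```python
import os


def lakebase_settings(env=os.environ):
    """Collect Lakebase connection settings from environment variables."""
    # Parameter name (as used by with_lakebase above) -> hypothetical env var.
    required = {
        "host": "LAKEBASE_HOST",
        "database": "LAKEBASE_DATABASE",
        "user": "LAKEBASE_USER",
        "password": "LAKEBASE_PASSWORD",
    }
    missing = [var for var in required.values() if var not in env]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {param: env[var] for param, var in required.items()}


# With the variables set, the result can be passed to the call shown above:
#   lakebase = VectrixDB.with_lakebase(**lakebase_settings())
```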

Adaptive Schema

Schema adapts based on selected mode:

Mode      Columns Created
--------  ---------------------------------------------------------------
dense     dense_embedding
hybrid    dense_embedding + sparse_embedding
ultimate  dense_embedding + sparse_embedding + late_interaction_embedding
graph     Same as ultimate + graph tables

All modes also store text_content, which the reranker reads at query time (reranker scores are computed per query rather than stored).
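
The mapping from mode to columns can be written out as a small lookup. This is only a restatement of the table in Python; `columns_for` is a hypothetical helper, not part of the VectrixDB API, and the graph tables are left out because their layout is not described here.

```python
# Column every mode stores, per the note above.
BASE_COLUMNS = ["text_content"]

# Embedding columns per mode, copied from the table.
MODE_COLUMNS = {
    "dense": ["dense_embedding"],
    "hybrid": ["dense_embedding", "sparse_embedding"],
    "ultimate": ["dense_embedding", "sparse_embedding", "late_interaction_embedding"],
    # graph uses the same columns as ultimate, plus separate graph tables.
    "graph": ["dense_embedding", "sparse_embedding", "late_interaction_embedding"],
}


def columns_for(mode: str) -> list:
    """Return the columns created for a given search mode."""
    if mode not in MODE_COLUMNS:
        raise ValueError(f"Unknown mode: {mode!r}")
    return BASE_COLUMNS + MODE_COLUMNS[mode]


print(columns_for("hybrid"))
```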

REST API

Start the server:

VECTRIXDB_API_KEY=your_secret vectrixdb serve --port 7337

Open the dashboard at http://localhost:7337/dashboard

API Examples

# Create collection
curl -X POST http://localhost:7337/api/v1/collections \
  -H "Content-Type: application/json" \
  -H "api-key: your_secret" \
  -d '{"name": "docs", "dimension": 384}'

# Add documents (auto-embedding)
curl -X POST http://localhost:7337/api/v1/collections/docs/text-upsert \
  -H "Content-Type: application/json" \
  -H "api-key: your_secret" \
  -d '{"points": [{"id": "1", "text": "Hello world"}]}'

# Search
curl -X POST http://localhost:7337/api/v1/collections/docs/text-search \
  -H "Content-Type: application/json" \
  -H "api-key: your_secret" \
  -d '{"query_text": "greeting", "limit": 10}'
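
The same calls can be made from Python. Here is a minimal sketch using only the standard library that builds the search request from the curl example above. The `build_request` helper is hypothetical, and the commented-out send step assumes the server replies with JSON, which this README does not document.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7337/api/v1"


def build_request(path, payload, api_key):
    """Build a JSON POST request carrying the api-key header."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


search = build_request(
    "/collections/docs/text-search",
    {"query_text": "greeting", "limit": 10},
    "your_secret",
)
print(search.get_method(), search.full_url)

# To actually send it (requires a running server):
#   with urllib.request.urlopen(search) as response:
#       print(json.load(response))
```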

Project Structure

VectrixDB/
├── vectrixdb/
│   ├── core/           # Vector index, storage, search
│   │   ├── graphrag/   # Knowledge graph
│   │   └── search/     # Search algorithms
│   ├── api/            # FastAPI server
│   ├── models/         # Embedded ONNX models
│   ├── dashboard/      # Web UI
│   └── cli.py          # Command line
├── tests/
└── requirements.txt

Install from Source

git clone https://github.com/knowusuboaky/VectrixDB.git
cd VectrixDB
pip install -e .

Requirements

  • Python 3.9+
  • No API keys needed
  • Models are bundled or auto-downloaded

License

Apache 2.0

Author

Kwadwo Daddy Nyame Owusu - Boakye

GitHub: @knowusuboaky
