High-performance, local RAG search engine and MCP/API server for Apple Silicon

⚡️ dbs-vector

A High-Performance, Arrow-Native Local Codebase Search Engine for Apple Silicon.

dbs-vector is an optimized Retrieval-Augmented Generation (RAG) search engine designed specifically for macOS (M-Series chips). It bypasses traditional Python serialization bottlenecks by utilizing Apple's Unified Memory Architecture (UMA) and pure Apache Arrow data pipelines.

It enables lightning-fast, hybrid (Vector + Full-Text) search across your local codebase, entirely offline.


✨ Features

  • Zero-Copy Memory Pipelines: Uses MLX to compute embeddings on the Mac GPU, casting the resulting tensors instantly into NumPy arrays via Unified Memory without costly float object instantiation.
  • Arrow-Native Storage: Uses LanceDB to stream ingestion batches directly to disk via PyArrow, avoiding the massive memory overhead of JSON and dictionary comprehensions.
  • Hybrid Retrieval: Simultaneously executes Approximate Nearest Neighbor (ANN) cosine vector search and native Tantivy Full-Text Search (FTS).
  • Code-Aware Chunking: Intelligently splits documentation and code, respecting markdown fences so that code blocks are indexed as atomic units.
  • Production Robustness: Features dynamic IVF_PQ indexing, Rust-level predicate pushdown (metadata filtering), and dataset compaction for delta-updates.
  • Remote SQL API Ingestion: ApiChunker pulls pre-aggregated slow-query records from any networked backend over HTTP, replacing local files with a paginated REST API — no changes to the embedding or storage layers.
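The code-aware chunking idea above can be illustrated with a minimal sketch. This is not the project's chunker: the function name `chunk_markdown` and the paragraph-on-blank-line rule are assumptions made for illustration. It only shows how a splitter can treat a fenced code block as one atomic chunk, even when the block contains blank lines.

```python
from typing import List

# Markdown code-fence marker, built this way so the example does not nest fences.
FENCE = "`" * 3


def chunk_markdown(text: str) -> List[str]:
    """Split markdown into chunks, keeping each fenced code block whole.

    Hypothetical helper: prose is split on blank lines, fenced code is never split.
    """
    chunks: List[str] = []
    buffer: List[str] = []
    in_fence = False

    def flush() -> None:
        # Emit the buffered lines as one chunk, skipping empty buffers.
        chunk = "\n".join(buffer).strip()
        if chunk:
            chunks.append(chunk)
        buffer.clear()

    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            if not in_fence:
                flush()              # close the prose chunk before the code block
                buffer.append(line)
                in_fence = True
            else:
                buffer.append(line)
                flush()              # emit the whole code block as one chunk
                in_fence = False
        elif not in_fence and not line.strip():
            flush()                  # a blank line ends a prose paragraph
        else:
            buffer.append(line)      # prose line, or any line inside a fence
    flush()                          # emit whatever remains (including an unclosed fence)
    return chunks
```

Calling `chunk_markdown` on a document with an introduction, one fenced block, and a closing sentence yields three chunks, with the fenced block intact in the middle.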

🚀 Installation

This project is built using uv, an extremely fast Python package manager.

  1. Clone the repository:

    git clone https://github.com/dbsmedya/dbs-vector.git
    cd dbs-vector
    
  2. Install the CLI package:

    uv sync
    

    This automatically sets up the environment and installs the dbs-vector executable into the project's virtual environment, so it can be invoked with uv run dbs-vector.

    Optional extras unlock additional ingestion sources:

    uv sync --extra sql  # DuckDB ingestion
    uv sync --extra api  # Remote HTTP API ingestion
    

💻 Usage

The application is entirely configuration-driven via config.yaml. It supports multiple data types (Engines) such as Markdown and SQL.

Global Options

  • --config-file / -c: Path to your custom config.yaml (Defaults to ./config.yaml).

Ingesting Documents

Index markdown files, JSON SQL logs, DuckDB analytical files, or a remote HTTP slow-query API into the local vector store.

# Ingest all markdown files (default)
uv run dbs-vector ingest "docs/"

# Ingest SQL slow query logs (JSON format)
uv run dbs-vector ingest "slow_queries.json" --type sql

# Ingest SQL slow queries from DuckDB (High-Performance Columnar)
uv run dbs-vector ingest "slow_queries.duckdb" --type sql --rebuild

# Ingest from a remote HTTP API (paginated GET)
uv run dbs-vector ingest "https://slow-log-api.internal/api/v1" --type sql-api

# Ingest via a custom SELECT sent to the remote API
uv run dbs-vector ingest "https://slow-log-api.internal/api/v1" --type sql-api \
  --query "SELECT fingerprint_id AS id, sanitized_sql AS text, db AS source, ..."

Searching the Codebase

Execute queries against your chosen engine.

# Semantic hybrid search across markdown
uv run dbs-vector search "What is MLX?"

# Find similar slow queries (SQL clustering)
uv run dbs-vector search "SELECT * FROM users" --type sql --min-time 1000

Indexes are built automatically at the end of every ingest run. Two indexes are created:

  • IVF_PQ vector index (only when the table has > 256 rows)
  • Tantivy FTS inverted index (required for hybrid search)
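Hybrid search has to merge two ranked lists, one from the vector index and one from the FTS index. The README does not say how dbs-vector fuses them, so the sketch below uses Reciprocal Rank Fusion (RRF), a common choice for this step, purely as an assumption. The function name and the constant `k = 60` are illustrative and not taken from the project.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) per list it appears in (rank starts at 1);
    scores are summed across lists. Ties are broken by id for determinism.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists: ids returned by the vector search and by the FTS search.
vector_hits = ["a", "b", "c"]
fts_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([vector_hits, fts_hits])
```

Documents that appear in both lists (`a` and `b` here) rise above documents that appear in only one, which is the behaviour hybrid retrieval is meant to provide.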

If you see a "Cannot perform full text search unless an INVERTED index has been created" error, it means the FTS index was never built for your table. Fix it by re-running ingestion — use --rebuild to wipe and re-index from scratch:

uv run dbs-vector ingest "docs/" --rebuild
uv run dbs-vector ingest "slow_queries.json" --type sql --rebuild

For detailed specifications on each ingestion source, see:

👉 SQL Engine Documentation
👉 DuckDB Ingestion Documentation
👉 Remote SQL API Ingestion

Async API Server

The application includes a high-performance FastAPI server to expose the search engine over HTTP.

# Start the API server (loads all engines defined in config.yaml)
uv run dbs-vector serve

For full API specifications and swagger documentation, see: 👉 API Usage & Documentation

Model Context Protocol (MCP) Server

dbs-vector includes a built-in MCP server compatible with Claude Desktop, Claude Code (CLI), and Cursor. It supports two transports: stdio (no server required) and Streamable HTTP (one shared instance, which saves VRAM).

# stdio — each client spawns its own process
uv run dbs-vector mcp

# HTTP — one shared server for all clients
uv run dbs-vector serve   # MCP endpoint: http://127.0.0.1:8000/mcp

For setup instructions for all clients and transport types, see: 👉 MCP Server Documentation

🏗 Architecture & Roadmap

dbs-vector is built upon strict Clean Architecture and SOLID principles. It utilizes a Configuration-Driven Registry Pattern, allowing new data engines (e.g., LibCST, Logs) to be added by simply updating config.yaml and registering new mappers/chunkers without modifying core orchestration logic.
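As a rough illustration of a configuration-driven registry, the sketch below maps engine names to chunkers through a lookup table. Everything in it is hypothetical: the `register_chunker` decorator, the two toy chunkers, and the `CONFIG` dictionary (which stands in for a parsed config.yaml) are assumptions and do not reflect the project's actual classes or configuration keys.

```python
from typing import Callable, Dict, List

Chunker = Callable[[str], List[str]]

# Registry of available chunkers, keyed by the name used in configuration.
CHUNKERS: Dict[str, Chunker] = {}


def register_chunker(name: str) -> Callable[[Chunker], Chunker]:
    """Decorator that adds a chunker to the registry under `name`."""
    def decorator(fn: Chunker) -> Chunker:
        CHUNKERS[name] = fn
        return fn
    return decorator


@register_chunker("paragraph")
def paragraph_chunker(text: str) -> List[str]:
    # Split on blank lines and drop empty pieces.
    return [part.strip() for part in text.split("\n\n") if part.strip()]


@register_chunker("line")
def line_chunker(text: str) -> List[str]:
    # Treat every non-empty line as its own chunk.
    return [line.strip() for line in text.splitlines() if line.strip()]


# Stand-in for a parsed config.yaml; the keys are assumptions for this sketch.
CONFIG = {"engines": {"md": {"chunker": "paragraph"}, "sql": {"chunker": "line"}}}


def chunk_for_engine(engine: str, text: str) -> List[str]:
    """Orchestration code: it only consults the config and the registry."""
    chunker_name = CONFIG["engines"][engine]["chunker"]
    return CHUNKERS[chunker_name](text)
```

Adding a new engine in this sketch means registering one more function and adding one entry to the configuration; `chunk_for_engine` itself does not change.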

Specialized Gemma Workflows

The project is optimized for instruction-tuned models like embeddinggemma. It supports asymmetric task-based workflows defined in config.yaml:

  • Markdown (Search Result): Uses the task: search result prefix for queries and title: none | text: for documents, maximizing retrieval accuracy for RAG.
  • SQL (Clustering): Uses the task: clustering prefix for both ingestion and search, enabling high-precision semantic grouping of logically similar slow queries.
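The two workflows above amount to prepending different prompt strings before embedding. The helpers below are a sketch, not the project's code: the function names are hypothetical, and the `| query:` separator is an assumption based on the prompt format published for EmbeddingGemma, since the README only names the `task:` and `title: none | text:` prefixes.

```python
def format_query(text: str, task: str = "search result") -> str:
    """Build a task-prefixed prompt (hypothetical helper)."""
    return f"task: {task} | query: {text}"


def format_document(text: str, title: str = "none") -> str:
    """Build a document prompt with the title/text prefix (hypothetical helper)."""
    return f"title: {title} | text: {text}"


# Markdown engine: asymmetric prompts for queries and documents.
md_query = format_query("What is MLX?")
md_document = format_document("MLX is an array framework for Apple Silicon.")

# SQL engine: the same clustering prompt is used at ingestion and at search time.
sql_prompt = format_query("SELECT * FROM users", task="clustering")
```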

Future Hardware Support (CUDA/TPU)

Because the core RAG orchestration depends only on the IEmbedder Protocol, it is hardware-agnostic. The project is currently optimized for Apple Silicon via MLXEmbedder; deploying to cloud GPUs or Linux environments would only require implementing a new CudaEmbedder (using PyTorch/Transformers) that returns standard NumPy arrays. No changes to the ingestion, storage, or API layers would be needed to support new hardware accelerators. A CUDA backend has not been built yet, as no CUDA hardware is available to the project at the moment.
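The protocol-based design can be sketched as follows. The README does not show the real IEmbedder interface, so the `EmbedderProtocol` class, its `embed` method, and the toy `HashEmbedder` backend are all assumptions. To keep the example dependency-free, vectors are plain lists of floats here, whereas the project's embedders return NumPy arrays.

```python
import hashlib
from typing import Dict, List, Protocol, Sequence, runtime_checkable


@runtime_checkable
class EmbedderProtocol(Protocol):
    """Stand-in for the project's IEmbedder; the method signature is an assumption."""

    def embed(self, texts: Sequence[str]) -> List[List[float]]: ...


class HashEmbedder:
    """Toy backend that derives a deterministic vector from a SHA-256 digest.

    It plays the role a hardware-specific embedder would play; it is not a real model.
    """

    def __init__(self, dim: int = 8) -> None:
        if not 1 <= dim <= 32:
            raise ValueError("dim must be between 1 and 32 (SHA-256 yields 32 bytes)")
        self.dim = dim

    def embed(self, texts: Sequence[str]) -> List[List[float]]:
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            # Scale the first `dim` bytes into the range [0, 1].
            vectors.append([digest[i] / 255.0 for i in range(self.dim)])
        return vectors


def index_texts(embedder: EmbedderProtocol, texts: Sequence[str]) -> Dict[str, List[float]]:
    """Orchestration code that depends only on the protocol, not on a backend."""
    return dict(zip(texts, embedder.embed(texts)))
```

Because `index_texts` only relies on the `embed` method, any backend that satisfies the protocol can be passed in without touching the orchestration code.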

For a deep dive into the engineering, the Apache Arrow ingestion lifecycle, and the blueprint for AST/LibCST integration, see the official documentation:

👉 Architecture & Engineering Documentation

🛠 Development

dbs-vector uses poethepoet as a task runner and enforces strict quality gates (Ruff and Mypy). Contributors should run the following before submitting changes:

# Run the entire validation suite (Format, Lint, Typecheck, Pytest)
uv run poe check

# Run tests with coverage
uv run poe test-cov
