RAG MCP server: ChromaDB + sentence-transformers, exposes ingest/search/list/delete tools.

Agent_rag

Agent_rag is a RAG (Retrieval-Augmented Generation) server that speaks MCP (Model Context Protocol). It uses ChromaDB for vector storage and supports several embedding backends, from a lightweight Ollama integration to local ONNX models, without API keys or hosted services.

Available Tools

This server exposes four MCP tools to the orchestrator (Agent_head) or any other MCP client:

  • rag_ingest — Ingest documents, directories, or raw text into a collection
  • rag_search — Perform semantic search against your knowledge base
  • rag_list_collections — List all active collections
  • rag_delete_collection — Delete a specific collection
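
As a rough illustration of how an MCP client might invoke one of these tools, the sketch below builds the JSON-RPC 2.0 `tools/call` request that MCP uses for tool invocation. The argument names (`query`, `collection`, `top_k`) are assumptions made for illustration and are not taken from this project's tool schemas; the schemas returned by the server's `tools/list` response are the authority on the real parameters.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical arguments: the real parameter names come from the server's tool schema.
request = build_tool_call(
    1,
    "rag_search",
    {"query": "How is chunking configured?", "collection": "agent_memory", "top_k": 3},
)
print(request)
```

In practice an MCP client library handles this framing for you; the sketch only shows what travels over the transport.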

Features

  • Flexible Embedding Providers — Choose your embedding backend:
    • Ollama (~50 MB): Lightweight; requires a running Ollama service. About 94% smaller than the torch-based setup
    • Local ONNX (~300 MB): Fully offline with CPU/GPU support; no external services needed
    • GPU-Accelerated (~1.2 GB): PyTorch + sentence-transformers for maximum performance
  • ChromaDB Integration: Persistent vector database for efficient semantic search
  • FastMCP Built-in: Asynchronous, thread-safe tool execution
  • Easy Configuration: Flexible config.yaml for chunk size, collections, embedding models, and provider selection
  • Zero External Dependencies: No API keys or hosted services required; everything runs on your machine
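
Semantic search, as mentioned in the features above, ranks stored chunks by how close their embedding vectors are to the embedding of the query. The following is a minimal pure-Python sketch of that idea using cosine similarity, one common choice of metric. It is not Agent_rag's or ChromaDB's implementation, and the three-dimensional vectors are made-up toy values rather than real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def top_k(query: list[float], store: dict[str, list[float]], k: int = 2) -> list[tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy "embeddings"; a real provider would return vectors with hundreds of dimensions.
toy_store = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 0.0, 1.0],
}
results = top_k([1.0, 0.0, 0.0], toy_store, k=2)
```

A vector database replaces the linear scan in `top_k` with an index so that lookups stay fast as the collection grows.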

Quick Start

For a complete installation guide with size comparisons, troubleshooting, and provider-specific setup:

👉 See INSTALLATION.md

Installation & Usage

Interactive Setup (Recommended)

cd Agent_rag
python setup_agent_rag.py

This interactive script guides you through provider selection and runs the appropriate installation command.

Direct Installation

For Ollama (lightweight, ~50 MB):

uv pip install ".[ollama]"
python server.py

For Local Offline (ONNX, ~300 MB):

uv pip install ".[local]"
python server.py

For GPU (PyTorch acceleration, ~1.2 GB):

uv pip install ".[gpu]"
python server.py

For All Providers (complete setup):

uv pip install ".[all]"
python server.py

Running with uvx

You can run the published MCP server directly. uvx will automatically download and run the latest version:

uvx agent-rag-mcp

Transport Modes

By default, the server runs in stdio transport mode (designed to be spawned as a subprocess by MCP clients like Agent_head).

To run it over HTTP using Server-Sent Events (SSE):

uvx agent-rag-mcp --transport sse --port 8002 --host 0.0.0.0
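
A command line like the one above could be parsed with a small argparse setup. This is a sketch only: the flag names come from the command shown and stdio is the documented default transport, but the host and port defaults and the accepted choices are assumptions, not the package's actual CLI code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the transport flags shown in this README."""
    parser = argparse.ArgumentParser(prog="agent-rag-mcp")
    parser.add_argument("--transport", choices=["stdio", "sse"], default="stdio")
    parser.add_argument("--host", default="127.0.0.1")  # assumed default
    parser.add_argument("--port", type=int, default=8002)  # assumed default
    return parser


# Parse the same flags as the SSE example above.
args = build_parser().parse_args(
    ["--transport", "sse", "--port", "8002", "--host", "0.0.0.0"]
)
```

Note that binding to 0.0.0.0 makes the server reachable on every network interface of the machine, whereas 127.0.0.1 restricts it to local connections.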

Specifying a Test Registry (If using TestPyPI)

If you published the package to TestPyPI instead of the main PyPI, run it via:

uvx --extra-index-url https://test.pypi.org/simple/ --index-strategy unsafe-best-match agent-rag-mcp@latest

Configuration

The embedding provider and collection behavior are configured in config.yaml:

embeddings:
  provider: "ollama" # Options: "ollama", "onnx"
  ollama_base_url: "http://localhost:11434" # Ollama service URL
  ollama_model: "nomic-embed-text" # Ollama embedding model

  # For ONNX provider:
  # model: "all-MiniLM-L6-v2"
  # device: "cpu"                 # Options: "cpu", "cuda"

database:
  persist_directory: "./chroma_db"

document_processing:
  chunk_size: 1000
  chunk_overlap: 200

For detailed configuration options and provider-specific settings, see INSTALLATION.md.
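
To make the chunk_size and chunk_overlap settings concrete, here is a minimal sliding-window chunker. It assumes both values are measured in characters, which this README does not state, and it is an illustration rather than the splitter Agent_rag actually uses.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into windows of chunk_size that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # distance between consecutive chunk starts
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# 2500 characters of repeating letters, so that overlaps are easy to inspect.
sample = "".join(chr(97 + i % 26) for i in range(2500))
chunks = chunk_text(sample)
```

With the default values each chunk starts 800 characters after the previous one, so neighbouring chunks share 200 characters. A larger overlap preserves more context across chunk boundaries at the cost of storing more redundant text.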

Integrating with Agent_head

To connect this RAG server to your Agent_head orchestrator, add the following configuration to Agent_head/config.yaml:

memory:
  enabled: true
  backend: "rag"

  # Configure this if backend is set to "rag"
  rag_server:
    command: "uvx"
    args: ["agent-rag-mcp"] # Or ["--from", "/path/to/local/Agent_rag", "agent-rag-mcp"] for local development
    collection: "agent_memory"
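
For orientation, a client could turn the rag_server block into a subprocess command roughly as follows. This is a hypothetical sketch of the idea, not Agent_head's code: the dictionary mirrors the YAML above, and `build_server_command` is an illustrative helper name.

```python
import shlex


def build_server_command(rag_server: dict) -> list[str]:
    """Combine the configured command and args into an argv list."""
    command = rag_server["command"]
    args = rag_server.get("args", [])
    return [command, *args]


# Same values as the rag_server block in the YAML above.
rag_server = {"command": "uvx", "args": ["agent-rag-mcp"], "collection": "agent_memory"}
argv = build_server_command(rag_server)
print(shlex.join(argv))
```

The resulting list is suitable for `subprocess.Popen` with piped stdin and stdout, which is how a stdio-transport MCP server is typically launched.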

Local Development

If you are developing this package locally:

  1. Install dependencies:
    uv sync
    
  2. Run locally:
    uv run agent-rag-mcp
    
  3. Test the server:
    python test_mcp_client.py
    
  4. Build the package:
    uv build
    

Architecture

Agent_rag is built from four modular components:

  • Embeddings Layer — Pluggable providers (Ollama, ONNX, with room for others)
  • ChromaDB — Local, persistent vector database with SQLite backend
  • MCP Server — FastMCP-based async tool execution and stdio/SSE transport
  • Document Pipeline — Configurable chunking, ingestion, and collection management
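
One way to picture a pluggable embeddings layer is a small interface plus a registry keyed by the provider name from config.yaml. Everything in this sketch (the `EmbeddingProvider` protocol, the `register` decorator, and the toy `HashProvider`) is hypothetical and is not taken from Agent_rag's source. The toy provider produces deterministic pseudo-embeddings purely so the example runs without any model.

```python
import hashlib
from typing import Callable, Protocol


class EmbeddingProvider(Protocol):
    """Interface that every embedding backend would implement."""

    def embed(self, texts: list[str]) -> list[list[float]]: ...


_REGISTRY: dict[str, Callable[[], EmbeddingProvider]] = {}


def register(name: str):
    """Class decorator that records a provider factory under a config name."""
    def decorator(factory):
        _REGISTRY[name] = factory
        return factory
    return decorator


@register("toy-hash")
class HashProvider:
    """Stand-in provider: maps text to 4 floats derived from a SHA-256 digest."""

    def embed(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            vectors.append([byte / 255.0 for byte in digest[:4]])
        return vectors


def get_provider(name: str) -> EmbeddingProvider:
    """Instantiate the provider registered under the given name."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown embedding provider: {name!r}") from None


provider = get_provider("toy-hash")
vectors = provider.embed(["hello", "world"])
```

With this shape, adding a backend means registering one more class; the code that ingests and searches only ever calls `embed`.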

For more details, see INSTALLATION.md.
