RAG MCP server: ChromaDB + sentence-transformers, exposes ingest/search/list/delete tools.

Agent_rag

Agent_rag is a RAG (Retrieval-Augmented Generation) server for MCP (Model Context Protocol). It uses ChromaDB for vector storage and offers several embedding backends, from a lightweight Ollama integration to local ONNX models. None of them require API keys or hosted services.

Available Tools

This server exposes four MCP tools for the orchestrator (Agent_head) or any other MCP client:

  • rag_ingest — Ingest documents, directories, or raw text into a collection
  • rag_search — Perform semantic search against your knowledge base
  • rag_list_collections — List all active collections
  • rag_delete_collection — Delete a specific collection
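
As a rough illustration of what a client sends when it invokes one of these tools, the sketch below builds an MCP `tools/call` request as a JSON-RPC 2.0 message using only the standard library. The tool name `rag_search` comes from the list above; the argument names (`query`, `collection`, `top_k`) are assumptions made for illustration, and the real ones should be read from the server's tool schema. A real client also performs the MCP initialization handshake first, which is omitted here.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so responses can be matched to them.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical argument names: check the schema returned by `tools/list`.
request = build_tool_call(
    "rag_search",
    {"query": "How do I rotate API keys?", "collection": "agent_memory", "top_k": 3},
)
print(request)
```

In practice an MCP client library handles this framing, so the sketch is mainly useful for reading logs or debugging a transport.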

Features

  • Flexible Embedding Providers — Choose your embedding backend:
    • Ollama (~50 MB): Lightweight, runs with Ollama service—94% smaller than torch-based setup
    • Local ONNX (~300 MB): Fully offline with CPU/GPU support—no external services needed
    • GPU-Accelerated (~1.2 GB): PyTorch + sentence-transformers for maximum performance
  • ChromaDB Integration: Persistent vector database for efficient semantic search
  • FastMCP Built-in: Asynchronous, thread-safe tool execution
  • Easy Configuration: Flexible config.yaml for chunk size, collections, embedding models, and provider selection
  • Zero External Dependencies: No API keys required—everything runs locally

Quick Start

For a complete installation guide with size comparisons, troubleshooting, and provider-specific setup:

👉 See INSTALLATION.md

Installation & Usage

Interactive Setup (Recommended)

cd Agent_rag
python setup_agent_rag.py

This interactive script guides you through provider selection and runs the appropriate installation command.

Direct Installation

For Ollama (lightweight, ~50 MB):

uv pip install ".[ollama]"
python server.py

For Local Offline (ONNX, ~300 MB):

uv pip install ".[local]"
python server.py

For GPU (PyTorch acceleration, ~1.2 GB):

uv pip install ".[gpu]"
python server.py

For All Providers (complete setup):

uv pip install ".[all]"
python server.py

Running with uvx

You can run the published MCP server directly. uvx will automatically download and run the latest version:

uvx agent-rag-mcp

Transport Modes

By default, the server runs in stdio transport mode (designed to be spawned as a subprocess by MCP clients like Agent_head).

To run it over HTTP using Server-Sent Events (SSE):

uvx agent-rag-mcp --transport sse --port 8002 --host 0.0.0.0
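
The flags in the command above could be parsed along the following lines. This is a minimal sketch, not the server's actual entry point: the function name `build_parser` is hypothetical, and the default host and port are assumptions. Only the `stdio` default for the transport is stated in this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in this README."""
    parser = argparse.ArgumentParser(prog="agent-rag-mcp")
    parser.add_argument("--transport", choices=["stdio", "sse"], default="stdio")
    parser.add_argument("--host", default="127.0.0.1")     # assumed default
    parser.add_argument("--port", type=int, default=8002)  # assumed default
    return parser


args = build_parser().parse_args(
    ["--transport", "sse", "--port", "8002", "--host", "0.0.0.0"]
)
print(args.transport, args.host, args.port)
```

Note that binding to 0.0.0.0 makes the server reachable on every network interface of the machine; 127.0.0.1 restricts it to local connections.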

Specifying a Test Registry (If using TestPyPI)

If you published the package to TestPyPI instead of the main PyPI, run it via:

uvx --extra-index-url https://test.pypi.org/simple/ --index-strategy unsafe-best-match agent-rag-mcp@latest

Configuration

The embedding provider and collection behavior are configured in config.yaml:

embeddings:
  provider: "ollama" # Options: "ollama", "onnx"
  ollama_base_url: "http://localhost:11434" # Ollama service URL
  ollama_model: "nomic-embed-text" # Ollama embedding model

  # For ONNX provider:
  # model: "all-MiniLM-L6-v2"
  # device: "cpu"                 # Options: "cpu", "cuda"

database:
  persist_directory: "./chroma_db"

document_processing:
  chunk_size: 1000
  chunk_overlap: 200

For detailed configuration options and provider-specific settings, see INSTALLATION.md.
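
To make the `chunk_size` and `chunk_overlap` settings concrete, here is a minimal sketch of a character-based sliding window. It assumes the values are measured in characters and that consecutive chunks share `chunk_overlap` characters; the server's actual splitter may work on tokens or sentence boundaries instead, and `chunk_text` is a hypothetical name.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


document = "x" * 2500
print([len(chunk) for chunk in chunk_text(document)])  # [1000, 1000, 900]
```

With the defaults shown in config.yaml, each window advances by 800 characters, so a 2,500-character document yields three chunks.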

Integrating with Agent_head

To connect this RAG server to your Agent_head orchestrator, add the following configuration to Agent_head/config.yaml:

memory:
  enabled: true
  backend: "rag"

  # Configure this if backend is set to "rag"
  rag_server:
    command: "uvx"
    args: ["agent-rag-mcp"] # Or ["--from", "/path/to/local/Agent_rag", "agent-rag-mcp"] for local development
    collection: "agent_memory"
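
For a sense of how an orchestrator might turn this configuration into a subprocess command for the stdio transport, consider the sketch below. It is an illustration only: `rag_server_argv` is a hypothetical helper, the configuration is written as a Python dict rather than loaded from YAML, and Agent_head's real startup logic may differ.

```python
import shlex
from typing import Optional

# The YAML above, written out as the dict a YAML loader would produce.
memory_config = {
    "enabled": True,
    "backend": "rag",
    "rag_server": {
        "command": "uvx",
        "args": ["agent-rag-mcp"],
        "collection": "agent_memory",
    },
}


def rag_server_argv(memory: dict) -> Optional[list[str]]:
    """Return the argv used to spawn the RAG server, or None if RAG memory is off."""
    if not memory.get("enabled") or memory.get("backend") != "rag":
        return None
    server = memory["rag_server"]
    return [server["command"], *server.get("args", [])]


argv = rag_server_argv(memory_config)
print(shlex.join(argv))  # uvx agent-rag-mcp
```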

Local Development

If you are developing this package locally:

  1. Install dependencies:
    uv sync
    
  2. Run locally:
    uv run agent-rag-mcp
    
  3. Test the server:
    python test_mcp_client.py
    
  4. Build the package:
    uv build
    

Architecture

Agent_rag uses a modular provider system:

  • Embeddings Layer — Pluggable providers (Ollama, ONNX, future extensibility)
  • ChromaDB — Local, persistent vector database with SQLite backend
  • MCP Server — FastMCP-based async tool execution and stdio/SSE transport
  • Document Pipeline — Configurable chunking, ingestion, and collection management

For more details, see INSTALLATION.md.
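
The pluggable embeddings layer described above can be pictured with a small registry sketch. Everything here is hypothetical and standard-library only: `EmbeddingProvider`, `HashingProvider`, `get_provider`, and `search` are illustrative names, the hashing embedder is a toy stand-in for Ollama or ONNX, and a plain list scan stands in for ChromaDB.

```python
import hashlib
import math
from typing import Protocol


class EmbeddingProvider(Protocol):
    """Interface every backend is assumed to satisfy in this sketch."""

    def embed(self, texts: list[str]) -> list[list[float]]: ...


class HashingProvider:
    """Toy backend: hashes each word into a bucket of a fixed-size vector."""

    def __init__(self, dim: int = 64) -> None:
        self.dim = dim

    def embed(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            vec = [0.0] * self.dim
            for word in text.lower().split():
                digest = hashlib.sha256(word.encode("utf-8")).digest()
                vec[int.from_bytes(digest[:4], "big") % self.dim] += 1.0
            norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
            vectors.append([v / norm for v in vec])
        return vectors


# Registry keyed by the provider name a config file would select.
PROVIDERS = {"hashing": HashingProvider}


def get_provider(name: str) -> EmbeddingProvider:
    try:
        return PROVIDERS[name]()
    except KeyError:
        raise ValueError(f"unknown embedding provider: {name!r}") from None


def search(provider: EmbeddingProvider, query: str, documents: list[str], top_k: int = 1):
    """Rank documents by cosine similarity (vectors are already unit length)."""
    doc_vectors = provider.embed(documents)
    query_vector = provider.embed([query])[0]
    scored = [
        (sum(q * d for q, d in zip(query_vector, vec)), doc)
        for vec, doc in zip(doc_vectors, documents)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


provider = get_provider("hashing")
docs = ["chromadb stores vectors on disk", "ollama serves embedding models"]
print(search(provider, "chromadb stores vectors on disk", docs))
```

Adding a backend in this design means registering one more class under a new key, which is the kind of extensibility the Embeddings Layer bullet refers to.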

Download files

Source Distribution: agent_rag_mcp-1.1.7.tar.gz (20.1 kB)

Built Distribution: agent_rag_mcp-1.1.7-py3-none-any.whl (21.1 kB)
