RAG MCP server: ChromaDB + sentence-transformers, exposes ingest/search/list/delete tools.
Agent_rag
Agent_rag is a RAG (Retrieval-Augmented Generation) server that speaks MCP (Model Context Protocol). It uses ChromaDB for vector storage and offers several embedding backends, from a lightweight Ollama integration to local ONNX models. No API keys are required, and everything runs locally.
Available Tools
This server exposes several MCP tools for the orchestrator (Agent_head) or any other MCP client:
- rag_ingest: Ingest documents, directories, or raw text into a collection
- rag_search: Perform semantic search against your knowledge base
- rag_list_collections: List all active collections
- rag_delete_collection: Delete a specific collection
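As a rough illustration of what an MCP client sends when it invokes one of these tools, the sketch below builds a JSON-RPC 2.0 `tools/call` message by hand. The method name and envelope follow the MCP specification; the argument names (`query`, `collection`) are assumptions made for the example, not the server's documented schema, so check the schema the server advertises before relying on them.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def tool_call_request(tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Argument names here are hypothetical placeholders.
request = tool_call_request(
    "rag_search",
    {"query": "how is chunking configured?", "collection": "agent_memory"},
)
print(request)
```

In practice an MCP client library handles this framing for you; the sketch only shows the shape of the message on the wire.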
Features
- Flexible Embedding Providers: Choose your embedding backend:
  - Ollama (~50 MB): Lightweight, runs with the Ollama service; 94% smaller than the torch-based setup
  - Local ONNX (~300 MB): Fully offline with CPU/GPU support; no external services needed
  - GPU-Accelerated (~1.2 GB): PyTorch + sentence-transformers for maximum performance
- ChromaDB Integration: Persistent vector database for efficient semantic search
- FastMCP Built-in: Asynchronous, thread-safe tool execution
- Easy Configuration: Flexible config.yaml for chunk size, collections, embedding models, and provider selection
- Zero External Dependencies: No API keys required; everything runs locally
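To make "semantic search" concrete, here is a minimal sketch of nearest-neighbor ranking over embedding vectors, which is conceptually what a vector database does at query time. It is plain Python, not ChromaDB's implementation; the cosine metric and the tiny two-dimensional vectors are assumptions chosen for readability (ChromaDB supports several distance metrics).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, docs, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(docs.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]


# Toy "embeddings"; real ones have hundreds of dimensions.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
print(top_k([1.0, 0.1], docs))
```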
Quick Start
For a complete installation guide with size comparisons, troubleshooting, and provider-specific setup:
👉 See INSTALLATION.md
Installation & Usage
Interactive Setup (Recommended)
cd Agent_rag
python setup_agent_rag.py
This interactive script guides you through provider selection and runs the appropriate installation command.
Direct Installation
For Ollama (lightweight, ~50 MB):
uv pip install ".[ollama]"
python server.py
For Local Offline (ONNX, ~300 MB):
uv pip install ".[local]"
python server.py
For GPU (PyTorch acceleration, ~1.2 GB):
uv pip install ".[gpu]"
python server.py
For All Providers (complete setup):
uv pip install ".[all]"
python server.py
Running with uvx
You can run the published MCP server directly. uvx will automatically download and run the latest version:
uvx agent-rag-mcp
Transport Modes
By default, the server runs in stdio transport mode (designed to be spawned as a subprocess by MCP clients like Agent_head).
To run it over HTTP using Server-Sent Events (SSE):
uvx agent-rag-mcp --transport sse --port 8002 --host 0.0.0.0
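For readers wondering how such flags are typically handled, the following is a hedged sketch of a command-line parser that accepts the same options. It is not taken from the package source; the default host and port are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the flags shown above (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="agent-rag-mcp")
    parser.add_argument("--transport", choices=["stdio", "sse"], default="stdio")
    parser.add_argument("--host", default="127.0.0.1")  # assumed default
    parser.add_argument("--port", type=int, default=8002)  # assumed default
    return parser


args = build_parser().parse_args(
    ["--transport", "sse", "--port", "8002", "--host", "0.0.0.0"]
)
print(args.transport, args.host, args.port)
```

Binding to 0.0.0.0 listens on all network interfaces; use 127.0.0.1 if the server should only be reachable from the local machine.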
Specifying a Test Registry (If using TestPyPI)
If you published the package to TestPyPI instead of the main PyPI, run it via:
uvx --extra-index-url https://test.pypi.org/simple/ --index-strategy unsafe-best-match agent-rag-mcp@latest
Configuration
The embedding provider and collection behavior are configured in config.yaml:
embeddings:
  provider: "ollama"  # Options: "ollama", "onnx"
  ollama_base_url: "http://localhost:11434"  # Ollama service URL
  ollama_model: "nomic-embed-text"  # Ollama embedding model
  # For ONNX provider:
  # model: "all-MiniLM-L6-v2"
  # device: "cpu"  # Options: "cpu", "cuda"
database:
  persist_directory: "./chroma_db"
document_processing:
  chunk_size: 1000
  chunk_overlap: 200
For detailed configuration options and provider-specific settings, see INSTALLATION.md.
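The chunk_size and chunk_overlap settings describe a sliding window over each document. The sketch below shows one common way such a window works, assuming both values are measured in characters; the package may count tokens or split on sentence boundaries instead, and `chunk_text` is a hypothetical helper, not part of Agent_rag.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(2500))
pieces = chunk_text(sample)
print([len(p) for p in pieces])
```

With the defaults, each chunk repeats the last 200 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk most of the time.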
Integrating with Agent_head
To connect this RAG server to your Agent_head orchestrator, add the following configuration to Agent_head/config.yaml:
memory:
  enabled: true
  backend: "rag"
  # Configure this if backend is set to "rag"
  rag_server:
    command: "uvx"
    args: ["agent-rag-mcp"]  # Or ["--from", "/path/to/local/Agent_rag", "agent-rag-mcp"] for local development
    collection: "agent_memory"
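To show how an orchestrator might turn that configuration into a subprocess command for the stdio transport, here is a small sketch. It assumes rag_server is nested under memory and that the config has already been loaded into a dict; `server_command` is a hypothetical helper, not Agent_head's actual code.

```python
import shlex
from typing import List, Optional

# Stand-in for the parsed Agent_head/config.yaml (nesting is an assumption).
config = {
    "memory": {
        "enabled": True,
        "backend": "rag",
        "rag_server": {
            "command": "uvx",
            "args": ["agent-rag-mcp"],
            "collection": "agent_memory",
        },
    }
}


def server_command(cfg: dict) -> Optional[List[str]]:
    """Return the argv used to spawn the RAG server, or None if RAG memory is off."""
    memory = cfg.get("memory", {})
    if not memory.get("enabled") or memory.get("backend") != "rag":
        return None
    server = memory["rag_server"]
    return [server["command"], *server.get("args", [])]


cmd = server_command(config)
print(shlex.join(cmd))
```

The resulting list can be passed to a process launcher such as `subprocess.Popen`, with stdin and stdout connected to the MCP client.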
Local Development
If you are developing this package locally:
- Install dependencies: uv sync
- Run locally: uv run agent-rag-mcp
- Test the server: python test_mcp_client.py
- Build the package: uv build
Architecture
Agent_rag uses a modular provider system:
- Embeddings Layer — Pluggable providers (Ollama, ONNX, future extensibility)
- ChromaDB — Local, persistent vector database with SQLite backend
- MCP Server — FastMCP-based async tool execution and stdio/SSE transport
- Document Pipeline — Configurable chunking, ingestion, and collection management
For more details, see INSTALLATION.md.
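A pluggable embeddings layer like the one described above is often built as a shared interface plus a name-to-factory registry. The sketch below illustrates that pattern under stated assumptions: `EmbeddingProvider`, `HashEmbedder`, `PROVIDERS`, and `get_provider` are hypothetical names, and the hash-based embedder is a toy stand-in for a real Ollama or ONNX backend.

```python
import hashlib
from typing import Callable, Dict, List, Protocol


class EmbeddingProvider(Protocol):
    """Interface every embedding backend is expected to satisfy."""

    def embed(self, texts: List[str]) -> List[List[float]]: ...


class HashEmbedder:
    """Toy provider: derives a fixed-size vector from a SHA-256 digest."""

    def __init__(self, dim: int = 8) -> None:
        self.dim = dim  # at most 32, the digest length in bytes

    def embed(self, texts: List[str]) -> List[List[float]]:
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            vectors.append([byte / 255.0 for byte in digest[: self.dim]])
        return vectors


# Registry mapping a config value (e.g. embeddings.provider) to a factory.
PROVIDERS: Dict[str, Callable[[], EmbeddingProvider]] = {"hash": HashEmbedder}


def get_provider(name: str) -> EmbeddingProvider:
    """Look up and instantiate the provider selected in the configuration."""
    try:
        factory = PROVIDERS[name]
    except KeyError:
        raise ValueError(f"unknown embedding provider: {name!r}") from None
    return factory()


vectors = get_provider("hash").embed(["hello", "world"])
print(len(vectors), len(vectors[0]))
```

Adding a backend then means writing one class with an `embed` method and registering it, without touching the ingestion or search code.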
File details
Details for the file agent_rag_mcp-1.1.8.tar.gz.
File metadata
- Download URL: agent_rag_mcp-1.1.8.tar.gz
- Upload date:
- Size: 20.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 22c0c436a19ce5f559934ea141444b25b80344b0a6bf4d5d21678d5d1281b44a |
| MD5 | 774895dc822895388e126995f68435f3 |
| BLAKE2b-256 | f82265d5bbcb7ef77b54af0fa3b918be66be64642e0729d2af2a8bd442ae54c7 |
File details
Details for the file agent_rag_mcp-1.1.8-py3-none-any.whl.
File metadata
- Download URL: agent_rag_mcp-1.1.8-py3-none-any.whl
- Upload date:
- Size: 21.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6ac792e87cee2295d5689db71ba1d14eb02a71b3891aa2c311464c4a517541d3 |
| MD5 | f7af07368e3ea8d095440bf2f2a84716 |
| BLAKE2b-256 | 01e4f7b850dadb89fa016ee7376d55539ccc8688bb87aa35a6380b78ad586190 |