RAG MCP: Document Processing Server

A local RAG MCP server built with ChromaDB, LangChain, and Docling.

A Retrieval-Augmented Generation (RAG) server built on the Model Context Protocol (MCP) for intelligent document processing and question answering.

Overview

RAG MCP is a tool that lets you index documents in a range of formats and run semantic searches against them. It converts your documents into vector embeddings and stores them in a local Chroma database, making them searchable through natural language queries.

Features

  • Document Indexing: Support for a wide range of document formats (PDF, DOCX, XLSX, PPTX, Markdown, AsciiDoc, HTML, XHTML, CSV)
  • Semantic Search: Query your documents using natural language
  • Flexible Embedding Models: Choose between HuggingFace BGE (default) and Ollama embeddings
  • High Performance: Automatic device selection (CUDA, MPS, CPU) for HuggingFace embeddings
  • Persistent Storage: Vector embeddings are stored locally for reuse

Requirements

  • Python 3.11+
  • Environment with access to your documents
  • (Optional) Ollama installed and running if using Ollama embeddings.

Installation

1. Install UV

First, you need to install UV, a Python package installer and resolver:

On macOS/Linux:

curl -sSf https://astral.sh/uv/install.sh | sh

On Windows:

powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

2. Run RAG MCP

Once UV is installed, you can run RAG MCP directly using:

uvx rag-mcp

This will start the MCP server and make it available for document processing.

Environment Variables:

You can configure RAG MCP using environment variables:

  • PERSIST_DIRECTORY (Required): Path to the directory where the vector database will be stored (e.g., /path/to/your/persist/directory). A chromadb subfolder will be created here.
  • USE_OLLAMA_EMBEDDING (Optional): Set to True to use Ollama embeddings instead of the default HuggingFace BGE embeddings. Requires Ollama to be running. Defaults to False.
  • OLLAMA_EMBEDDING_MODEL (Optional): Specifies the Ollama model name to use for embeddings (e.g., nomic-embed-text). Only used if USE_OLLAMA_EMBEDDING=True. Defaults to bge-m3:latest.

Example:

PERSIST_DIRECTORY=/data/rag_db USE_OLLAMA_EMBEDDING=True OLLAMA_EMBEDDING_MODEL=nomic-embed-text uvx rag-mcp
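Internally, the server presumably resolves these settings from the environment. A minimal sketch of that logic (hypothetical helper name, not the package's actual code):

```python
import os

def load_config() -> dict:
    # PERSIST_DIRECTORY is required; fail fast with a clear error if it is missing.
    persist_dir = os.environ.get("PERSIST_DIRECTORY")
    if not persist_dir:
        raise RuntimeError("PERSIST_DIRECTORY environment variable is required")

    # Treat "true"/"1"/"yes" (case-insensitive) as truthy; anything else is False.
    raw = os.environ.get("USE_OLLAMA_EMBEDDING", "False").strip().lower()
    use_ollama = raw in ("true", "1", "yes")

    return {
        "persist_directory": persist_dir,
        "use_ollama": use_ollama,
        # Only meaningful when use_ollama is True; defaults to bge-m3:latest.
        "ollama_model": os.environ.get("OLLAMA_EMBEDDING_MODEL", "bge-m3:latest"),
    }
```

With the example command above, such a loader would produce use_ollama=True and ollama_model="nomic-embed-text".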

IDE Integration

VS Code Integration

To integrate with Visual Studio Code, create a mcp.json file with the following content:

{
    "servers": {
        "rag-mcp-server": {
            "type": "stdio",
            "command": "uvx",
            "args": [
                "rag-mcp"
            ],
            "env": {
                "PERSIST_DIRECTORY": "/path/to/your/persist/directory"
                // Optional: uncomment to use Ollama (add a comma to the line above)
                // "USE_OLLAMA_EMBEDDING": "True",
                // "OLLAMA_EMBEDDING_MODEL": "nomic-embed-text"
            }
        }
    }
}

Replace /path/to/your/persist/directory with the desired path. Adjust Ollama settings as needed.

Cursor Integration

To integrate with Cursor, go to Cursor Settings > MCP and paste this configuration:

{
    "mcpServers": {
      "rag-mcp": {
        "command": "uvx",
        "args": ["rag-mcp"],
        "env": {
          "PERSIST_DIRECTORY": "/path/to/your/persist/directory"
        }
      }
    }
  }

Replace /path/to/your/persist/directory with the desired path. To use Ollama embeddings, add "USE_OLLAMA_EMBEDDING": "True" (and optionally "OLLAMA_EMBEDDING_MODEL": "nomic-embed-text") to the env block; comments are not valid in this configuration, so add the keys directly.

Usage

Supported Embedding Models

The system supports the following embedding models:

  1. HuggingFace BGE Embeddings (default): High-quality embeddings that work offline. Optimized for available hardware (CUDA, MPS, CPU).
    • Uses BAAI/bge-m3 model.
  2. Ollama Embeddings (optional): Uses embeddings generated by a running Ollama instance. Enable by setting USE_OLLAMA_EMBEDDING=True.
    • Default model: bge-m3:latest.
    • Specify a different model using OLLAMA_EMBEDDING_MODEL.
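The automatic device selection for HuggingFace embeddings can be sketched as follows. This is a simplified illustration under the fallback order stated above (CUDA, then MPS, then CPU), not the package's actual code:

```python
def pick_device() -> str:
    """Pick the best available device for HuggingFace embeddings: CUDA, then MPS, then CPU."""
    try:
        import torch  # optional dependency; fall back to CPU if unavailable
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # MPS is Apple's Metal backend; guard the attribute for older torch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```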

How It Works

  1. Document Loading: Uses DoclingLoader to parse various document formats
  2. Text Splitting: Documents are split into manageable chunks using RecursiveCharacterTextSplitter
  3. Embedding Generation: Text chunks are converted to vector embeddings using either HuggingFace BGE or Ollama
  4. Storage: Embeddings are stored in a Chroma vector database
  5. Retrieval: When queried, the system finds semantically similar content to answer questions
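The retrieval step (5) can be illustrated with a toy example. Naive bag-of-words vectors stand in here for the real BGE/Ollama embeddings, but the ranking idea is the same one Chroma applies at scale: score each stored chunk by cosine similarity to the query vector and return the closest match.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a word-count vector. Real systems use dense model embeddings.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str]) -> str:
    # Return the chunk most similar to the query.
    return max(chunks, key=lambda c: cosine(embed(query), embed(c)))

chunks = [
    "Invoices must be submitted by the fifth business day.",
    "The vector database stores document embeddings on disk.",
]
best = retrieve("where are embeddings kept", chunks)
```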

Acknowledgments
