MCP server for OCR with multiple backends (Marker, DeepSeek, Mistral)

OCR MCP Server

Multi-backend OCR MCP server for LibreChat with support for Marker (local), DeepSeek API, and Mistral API.

Features

  • Multiple Backends: Marker (local, free), DeepSeek API, Mistral API
  • Automatic Fallback: Falls back to the next enabled backend when one fails
  • Flexible Configuration: Environment-based configuration
  • Cost-Effective: Use local Marker for simple docs, APIs for complex handwriting
  • Easy Integration: Works with LibreChat MCP
  • Optional Backends: Enable only the backends you need
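
The automatic fallback described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the `Backend` callable type, `ocr_with_fallback`, `AllBackendsFailed`, and the two stand-in backends are hypothetical names introduced only for this example.

```python
from typing import Callable, Optional, Sequence

# Hypothetical backend signature: takes a file path, returns extracted text.
Backend = Callable[[str], str]


class AllBackendsFailed(RuntimeError):
    """Raised when every backend in the chain has failed."""


def ocr_with_fallback(
    file_path: str,
    backends: dict[str, Backend],
    order: Sequence[str],
    preferred: Optional[str] = None,
) -> tuple[str, str]:
    """Try backends in order; return (backend_name, text) from the first success."""
    # The preferred backend goes first, then the rest in their configured order.
    chain = [preferred] if preferred else []
    chain += [name for name in order if name != preferred]

    errors: list[str] = []
    for name in chain:
        backend = backends.get(name)
        if backend is None:
            errors.append(f"{name}: not enabled")
            continue
        try:
            return name, backend(file_path)
        except Exception as exc:  # any failure moves on to the next backend
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


# Stand-in backends used only to demonstrate the control flow.
def failing_marker(path: str) -> str:
    raise RuntimeError("could not parse handwriting")


def fake_mistral(path: str) -> str:
    return f"text from {path}"


result = ocr_with_fallback(
    "scan.pdf",
    backends={"marker": failing_marker, "mistral": fake_mistral},
    order=["marker", "mistral"],
)
print(result)  # ('mistral', 'text from scan.pdf')
```

Collecting the per-backend error messages makes the final failure easier to diagnose than reporting only the last exception.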

Quick Start

Minimal Setup (Marker + Mistral)

# Install
pip install -r requirements.txt

# Configure (only Mistral key required)
echo 'MISTRAL_API_KEY=your-key-here' > .env

# Start the server to confirm the setup works
python -m ocr_mcp.server

LibreChat Configuration

mcpServers:
  ocr:
    command: "python"
    args: ["-m", "ocr_mcp.server"]
    env:
      MISTRAL_API_KEY: "your-mistral-key-here"
      ENABLED_BACKENDS: "marker,mistral"
      DEFAULT_BACKEND: "marker"

Configuration

Environment Variables

# API Keys (only required for enabled backends)
MISTRAL_API_KEY=your-mistral-key-here      # Required if mistral enabled
DEEPSEEK_API_KEY=your-deepseek-key-here    # Optional - only if deepseek enabled

# Backend Configuration
ENABLED_BACKENDS=marker,mistral            # Comma-separated list
DEFAULT_BACKEND=marker                     # Default backend to use

# Processing Settings
MAX_FILE_SIZE_MB=50                        # Maximum file size
TIMEOUT_SECONDS=120                        # Processing timeout
API_TIMEOUT=30                             # API call timeout
API_MAX_RETRIES=3                          # API retry attempts
MARKER_BATCH_SIZE=1                        # Marker batch size
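
For readers who want to see how these variables might be consumed, here is a minimal sketch of environment-based loading. The `Settings` dataclass and `load_settings` function are hypothetical and do not reflect the real contents of `config.py`. The fallback values simply mirror the example above and are assumptions, not confirmed defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed above."""
    enabled_backends: tuple[str, ...]
    default_backend: str
    max_file_size_mb: int
    timeout_seconds: int
    api_timeout: int
    api_max_retries: int
    marker_batch_size: int
    mistral_api_key: Optional[str]
    deepseek_api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables."""
    # Split the comma-separated list, tolerating spaces and mixed case.
    backends = tuple(
        name.strip().lower()
        for name in env.get("ENABLED_BACKENDS", "marker,mistral").split(",")
        if name.strip()
    )
    return Settings(
        enabled_backends=backends,
        default_backend=env.get("DEFAULT_BACKEND", "marker").strip().lower(),
        max_file_size_mb=int(env.get("MAX_FILE_SIZE_MB", "50")),
        timeout_seconds=int(env.get("TIMEOUT_SECONDS", "120")),
        api_timeout=int(env.get("API_TIMEOUT", "30")),
        api_max_retries=int(env.get("API_MAX_RETRIES", "3")),
        marker_batch_size=int(env.get("MARKER_BATCH_SIZE", "1")),
        # Treat empty strings the same as unset keys.
        mistral_api_key=env.get("MISTRAL_API_KEY") or None,
        deepseek_api_key=env.get("DEEPSEEK_API_KEY") or None,
    )


settings = load_settings({"ENABLED_BACKENDS": "marker, Mistral", "MISTRAL_API_KEY": "k"})
print(settings.enabled_backends)  # ('marker', 'mistral')
```

Passing the mapping explicitly, rather than always reading `os.environ`, keeps the loader easy to exercise in tests.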

Backend Options

Marker Only (No API keys needed):

ENABLED_BACKENDS: "marker"
DEFAULT_BACKEND: "marker"

Marker + Mistral (Recommended):

MISTRAL_API_KEY: "your-key"
ENABLED_BACKENDS: "marker,mistral"
DEFAULT_BACKEND: "marker"

All Backends (once you have a DeepSeek key):

MISTRAL_API_KEY: "your-mistral-key"
DEEPSEEK_API_KEY: "your-deepseek-key"
ENABLED_BACKENDS: "marker,deepseek,mistral"
DEFAULT_BACKEND: "marker"

Usage

In LibreChat Assistant

Auto-select backend:

Extract text from this PDF

Specify backend:

Use Mistral to extract text from this handwritten page

Batch processing:

Process all uploaded PDFs with Marker first, use Mistral for pages with tables

Available Tools

  • ocr: Extract text from PDF files or images
    • file_path (required): Path to the file
    • backend (optional): Specific backend to use (marker, deepseek, mistral)
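
As an illustration of the tool's parameters, the sketch below builds the JSON-RPC `tools/call` request that an MCP client would send for the `ocr` tool. LibreChat normally does this for you. The helper `build_ocr_call` and the example path are hypothetical, and the client-side check on `backend` is an assumption based on the values listed above.

```python
import json
from typing import Optional

# Backend names taken from the parameter description above.
VALID_BACKENDS = ("marker", "deepseek", "mistral")


def build_ocr_call(
    file_path: str, backend: Optional[str] = None, request_id: int = 1
) -> dict:
    """Return a JSON-RPC 2.0 request that invokes the `ocr` tool."""
    if not file_path:
        raise ValueError("file_path is required")
    if backend is not None and backend not in VALID_BACKENDS:
        raise ValueError(f"backend must be one of {VALID_BACKENDS}")

    arguments = {"file_path": file_path}
    if backend is not None:
        # Omit the key entirely to let the server pick a backend.
        arguments["backend"] = backend

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "ocr", "arguments": arguments},
    }


# Hypothetical file path, for illustration only.
request = build_ocr_call("/data/notes.pdf", backend="mistral")
print(json.dumps(request, indent=2))
```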

Backend Details

Marker (Local)

  • Pros: Free (no API costs), fast, runs on CPU only
  • Best for: Text-heavy documents, clean PDFs
  • Limitations: Struggles with handwriting and complex layouts
  • Requirements: No API key needed

Mistral API (Pixtral)

  • Pros: Best OCR accuracy, excellent with complex layouts
  • Best for: Tables, forms, mixed content, handwriting
  • Cost: API usage based
  • Requirements: Valid Mistral API key

DeepSeek API

  • Pros: Excellent handwriting recognition, affordable
  • Best for: Handwritten notes, marked-up documents
  • Cost: API usage based
  • Requirements: Valid DeepSeek API key (optional)

Adding DeepSeek Later

When you're ready to enable DeepSeek:

  1. Get an API key from https://platform.deepseek.com/api_keys
  2. Add it to your configuration:
    DEEPSEEK_API_KEY: "your-deepseek-key-here"
    ENABLED_BACKENDS: "marker,deepseek,mistral"
  3. Restart LibreChat

The server will automatically detect and use DeepSeek when available.

Troubleshooting

MCP server not appearing

  • Check librechat.yaml indentation (YAML is whitespace-sensitive)
  • Verify API keys are set correctly
  • Check Docker logs: docker-compose logs -f

"Backend not available" error

  • Ensure ENABLED_BACKENDS lists only backends whose API keys are set
  • DeepSeek will be skipped if no API key is provided
  • Mistral requires a valid API key if enabled
  • Marker requires no extra setup

Slow processing

  • Marker runs locally (fast for simple docs)
  • API calls take 2-5 seconds per page
  • Consider batch processing for large books
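
To get a feel for the numbers, the sketch below turns the 2-5 seconds per page figure into a rough time estimate and splits a large document into page ranges. Both helpers are hypothetical. Whether the server accepts page ranges is not stated here, so treat the batching as something you would do on your side, for example by splitting the PDF beforehand.

```python
from typing import Iterator


def estimate_api_seconds(
    pages: int, low: float = 2.0, high: float = 5.0
) -> tuple[float, float]:
    """Return the (best, worst) case API time for a number of pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * low, pages * high


def page_batches(total_pages: int, batch_size: int) -> Iterator[tuple[int, int]]:
    """Yield inclusive 1-based (first_page, last_page) ranges."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(1, total_pages + 1, batch_size):
        yield start, min(start + batch_size - 1, total_pages)


best, worst = estimate_api_seconds(300)
print(f"{best / 60:.0f}-{worst / 60:.0f} minutes")  # 10-25 minutes
print(list(page_batches(10, 4)))  # [(1, 4), (5, 8), (9, 10)]
```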

Marker installation issues

# Install marker-pdf separately
pip install marker-pdf

# Or install all dependencies
pip install -r requirements.txt

Configuration validation errors

The server will validate your configuration on startup. Common errors:

  • "DeepSeek backend is enabled but DEEPSEEK_API_KEY is not set"
    • Either add the API key, or remove 'deepseek' from ENABLED_BACKENDS
  • "Mistral backend is enabled but MISTRAL_API_KEY is not set"
    • Either add the API key, or remove 'mistral' from ENABLED_BACKENDS
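
A startup check that produces the two messages above could look roughly like this. It is a sketch under assumptions, not the server's actual validation: `validate_config` and its constants are hypothetical, and the messages for unknown backends and a mismatched default are invented for illustration.

```python
from typing import Mapping

KNOWN_BACKENDS = ("marker", "deepseek", "mistral")

# Backends that need a key: name -> (display label, environment variable).
REQUIRED_KEYS = {
    "mistral": ("Mistral", "MISTRAL_API_KEY"),
    "deepseek": ("DeepSeek", "DEEPSEEK_API_KEY"),
}


def validate_config(env: Mapping[str, str]) -> list[str]:
    """Return a list of human-readable configuration errors (empty if valid)."""
    errors: list[str] = []
    enabled = [
        name.strip().lower()
        for name in env.get("ENABLED_BACKENDS", "").split(",")
        if name.strip()
    ]

    for name in enabled:
        if name not in KNOWN_BACKENDS:
            # Hypothetical message, not taken from the server.
            errors.append(f"Unknown backend '{name}' in ENABLED_BACKENDS")
        elif name in REQUIRED_KEYS:
            label, key = REQUIRED_KEYS[name]
            if not env.get(key):
                errors.append(f"{label} backend is enabled but {key} is not set")

    default = env.get("DEFAULT_BACKEND", "").strip().lower()
    if default and default not in enabled:
        # Hypothetical message, not taken from the server.
        errors.append(f"DEFAULT_BACKEND '{default}' is not in ENABLED_BACKENDS")
    return errors


problems = validate_config(
    {
        "ENABLED_BACKENDS": "marker,deepseek,mistral",
        "DEFAULT_BACKEND": "marker",
        "MISTRAL_API_KEY": "your-mistral-key",
    }
)
print(problems)
# ['DeepSeek backend is enabled but DEEPSEEK_API_KEY is not set']
```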

Development

Project Structure

ocr-mcp/
├── ocr_mcp/
│   ├── __init__.py
│   ├── server.py          # Main MCP server
│   ├── config.py          # Configuration management
│   └── backends/
│       ├── __init__.py
│       ├── base.py        # Base backend interface
│       ├── marker.py      # Marker backend
│       ├── deepseek.py    # DeepSeek API backend
│       └── mistral.py     # Mistral API backend
├── pyproject.toml
└── README.md
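
To give a sense of what a base backend interface such as the one in `base.py` might contain, here is a minimal sketch using an abstract base class. The class and method names (`OCRBackend`, `is_available`, `extract_text`, `EchoBackend`) are hypothetical and may not match the real module.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class OCRBackend(ABC):
    """Hypothetical common interface that each backend would implement."""

    name: str = "base"

    def is_available(self) -> bool:
        """Report whether the backend can run (for example, an API key is set)."""
        return True

    @abstractmethod
    def extract_text(self, file_path: Path) -> str:
        """Return the text extracted from a PDF or image file."""


class EchoBackend(OCRBackend):
    """Toy backend that shows the shape of a concrete implementation."""

    name = "echo"

    def extract_text(self, file_path: Path) -> str:
        return f"[{self.name}] would process {file_path.name}"


backend = EchoBackend()
print(backend.extract_text(Path("docs/invoice.pdf")))
# [echo] would process invoice.pdf
```

With a shared interface like this, the server could treat Marker, DeepSeek, and Mistral uniformly when choosing a backend or falling back.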

Running from source

# Install in development mode
pip install -e .

# Run server
python -m ocr_mcp.server

License

MIT License

Contributing

Contributions welcome! Please feel free to submit issues or pull requests.
