MinerU MCP Server for PDF to Markdown conversion

MinerU Open MCP

The official MinerU MCP server, which exposes MinerU's document parsing as MCP tools. Connect any MCP-compatible AI client to convert PDFs, Word docs, PowerPoint files, and images into Markdown.

No API key required — Flash mode works out of the box, free with no sign-up, for files up to 20 pages / 10 MB. Set MINERU_API_TOKEN to unlock higher limits and extra output formats.
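As a rough illustration of the Flash mode limits, the sketch below checks a document against the 20-page / 10 MB ceiling before sending it. The helper names are hypothetical, and reading "10 MB" as 10 MiB is an assumption; the server may count bytes differently. The caller supplies the page count, since counting pages needs a PDF library.

```python
from pathlib import Path

# Limits quoted in the text above for Flash mode.
FLASH_MAX_PAGES = 20
FLASH_MAX_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" interpreted as 10 MiB


def fits_flash_mode(size_bytes: int, page_count: int) -> bool:
    """Return True if a document is within the Flash mode limits."""
    return size_bytes <= FLASH_MAX_BYTES and page_count <= FLASH_MAX_PAGES


def file_fits_flash_mode(path: str, page_count: int) -> bool:
    """Same check for a file on disk; the caller supplies the page count."""
    return fits_flash_mode(Path(path).stat().st_size, page_count)


print(fits_flash_mode(2_500_000, 12))  # within both limits
print(fits_flash_mode(2_500_000, 45))  # too many pages
```

Documents that fail this check would need MINERU_API_TOKEN set to use the higher limits.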


⚡ Quickest Way to Run — uvx (no install needed)

mineru-open-mcp is on PyPI. With uv installed, you can run it directly — no separate install step.

Configure your MCP client

stdio — Claude Desktop, Cursor, Windsurf

The MCP client launches mineru-open-mcp as a subprocess automatically.

Using uvx (recommended — always runs the latest version):

{
  "mcpServers": {
    "mineru": {
      "command": "uvx",
      "args": ["mineru-open-mcp"],
      "env": {
        "MINERU_API_TOKEN": "your_key_here"
      }
    }
  }
}
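If you generate client configuration from a script, a minimal sketch like the following produces the same JSON with or without a token. The function name is hypothetical, and leaving out the env block to get Flash mode assumes the client does not inject MINERU_API_TOKEN from elsewhere.

```python
import json
from typing import Optional


def stdio_config(api_token: Optional[str] = None) -> dict:
    """Build the mcpServers entry shown above; omit env when no token is given."""
    server = {"command": "uvx", "args": ["mineru-open-mcp"]}
    if api_token:
        server["env"] = {"MINERU_API_TOKEN": api_token}
    return {"mcpServers": {"mineru": server}}


# Without a token the server is expected to fall back to Flash mode.
print(json.dumps(stdio_config(), indent=2))
print(json.dumps(stdio_config("your_key_here"), indent=2))
```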

No API key? Leave MINERU_API_TOKEN unset and the server runs in Flash mode (free, Markdown output only). See the Flash Mode docs for details.

mineru-open-mcp not on PATH? Use the full path: "/Users/you/.local/bin/mineru-open-mcp", or use the uvx approach above which handles this automatically.

Usage Examples

Example 1: Parse a local PDF document with target page ranges

User prompt: "Parse pages 3-5 of this PDF into Markdown: <your_path_to_file>"

What happens:

  • MinerU uploads and parses the PDF
  • Returns clean Markdown with tables (HTML) and formulas (LaTeX) preserved
  • Returns the Markdown text in the chat when its length permits, along with the output path and, if you prefer, the zip URL
  • MCP client summarizes the content

Example 2: Parse a file hosted at a remote URL

User prompt: "Extract contents from this paper: https://arxiv.org/pdf/2509.22186"

What happens:

  • MinerU parses the paper into Markdown
  • MCP client formats and explains the tables

Example 3: Parse local PDF files with independent page ranges

User prompt: "Parse <file1> page 1-5, <file2> page 2-9, <file3> page 3 into markdown"

What happens:

  • MinerU uploads and parses the files separately
  • Returns the outputs in the target format, a zip URL for download, a Markdown abstract, and the directory where the output is saved
  • MCP client uses the content for further analysis
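To make the per-file page ranges concrete, here is a small client-side sketch that expands specs such as "1-5" or "3" into explicit page numbers. It is only an illustration: the file names are placeholders, the helper is hypothetical, and how the server itself encodes page ranges is not described here.

```python
from typing import Dict, List


def parse_page_range(spec: str) -> List[int]:
    """Expand "1-5" or "3" into a list of 1-based page numbers."""
    spec = spec.strip()
    if "-" in spec:
        start_text, end_text = spec.split("-", 1)
        start, end = int(start_text), int(end_text)
    else:
        start = end = int(spec)
    if start < 1 or end < start:
        raise ValueError(f"invalid page range: {spec!r}")
    return list(range(start, end + 1))


# Placeholder file names mirroring the prompt in Example 3.
page_specs: Dict[str, str] = {"file1.pdf": "1-5", "file2.pdf": "2-9", "file3.pdf": "3"}
pages = {name: parse_page_range(spec) for name, spec in page_specs.items()}
print(pages["file1.pdf"])  # [1, 2, 3, 4, 5]
print(pages["file3.pdf"])  # [3]
```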

Example 4: Advanced custom preferences

User prompt 1: "use pipeline model to parse this Korean file your_path_here"

User prompt 2: "parse your_path_here and save the markdown to your_output_dir"

What happens:

  • The pipeline model is an alternative model provided by the MinerU service (the vlm model is the default)
  • You can specify a model, an OCR language, or an output directory different from OUTPUT_DIR in your prompt
  • Your request is translated into parameters for the parse_documents tool, and MinerU handles the rest

streamable-http — web-based MCP clients

Start the server manually, then point your client at it:

MINERU_API_TOKEN=your_key mineru-open-mcp --transport streamable-http --port 8001

{
  "mcpServers": {
    "mineru": {
      "type": "streamableHttp",
      "url": "http://127.0.0.1:8001/mcp"
    }
  }
}
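When the port is configurable, it helps to derive the launch command and the client URL from one value so they cannot drift apart. The sketch below does that using only the flags and URL shape shown above; the function names are hypothetical, and the /mcp path and 127.0.0.1 host are taken from the example rather than from a documented guarantee.

```python
import os
from typing import Dict, List, Optional, Tuple


def http_server_command(
    port: int = 8001, api_token: Optional[str] = None
) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for starting the server in streamable-http mode."""
    argv = ["mineru-open-mcp", "--transport", "streamable-http", "--port", str(port)]
    env = dict(os.environ)
    if api_token:
        env["MINERU_API_TOKEN"] = api_token
    return argv, env


def client_config(port: int = 8001) -> dict:
    """Return the matching client entry pointing at the same port."""
    url = f"http://127.0.0.1:{port}/mcp"
    return {"mcpServers": {"mineru": {"type": "streamableHttp", "url": url}}}


argv, env = http_server_command(port=8001)
print(" ".join(argv))
print(client_config(8001)["mcpServers"]["mineru"]["url"])
# To actually start the server: subprocess.Popen(argv, env=env)
```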

Features

  • parse_documents: convert local files and/or remote URLs to Markdown. Input supports PDF, images (png/jpg/jpeg/jp2/webp/gif/bmp), DOC, DOCX, PPT, and PPTX. Flash mode also supports XLSX.
  • get_ocr_languages: list all OCR languages supported by MinerU
  • Flash mode: works without an API key (free, Markdown output only, supports PDF/images/DOCX/PPTX/XLS/XLSX). For full features, set MINERU_API_TOKEN, which disables Flash mode.
  • Output behavior: single-file parses return inline Markdown by default; batch parses save results to disk and return file metadata. Oversized inline content is also saved locally and returned via extract_path.
  • Two transport modes: stdio and streamable-http
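The output behavior above means a caller may receive Markdown inline or only a path to saved files. The sketch below handles both cases under stated assumptions: the result is treated as a dict, the "markdown" key is hypothetical, and extract_path is assumed to point at either a Markdown file or a directory containing .md files. The actual tool response may be shaped differently.

```python
import tempfile
from pathlib import Path
from typing import Optional


def markdown_from_result(result: dict) -> Optional[str]:
    """Prefer inline Markdown; otherwise read .md files under extract_path."""
    inline = result.get("markdown")  # hypothetical key for inline content
    if inline:
        return inline
    extract_path = result.get("extract_path")
    if not extract_path:
        return None
    root = Path(extract_path)
    if root.is_file():
        md_files = [root]
    elif root.is_dir():
        md_files = sorted(root.rglob("*.md"))
    else:
        md_files = []
    return "\n\n".join(p.read_text(encoding="utf-8") for p in md_files) or None


# Demo with an inline result and a result saved to a temporary directory.
inline = markdown_from_result({"markdown": "# Inline"})
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "doc.md").write_text("# Title", encoding="utf-8")
    saved = markdown_from_result({"extract_path": tmp})
print(inline)
print(saved)
```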

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| MINERU_API_TOKEN | MinerU API token; apply for one on MinerU for full capability. If not provided, Flash mode is enabled. | (none) |
| OUTPUT_DIR | Directory used when parsed results need to be saved locally, such as batch parsing or oversized inline content | ~/mineru-downloads |
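The way these two variables interact can be sketched as follows. This is an illustration of the documented defaults, not the server's actual code: the function name is hypothetical, and treating a blank or whitespace-only token as unset is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Derive Flash mode and the output directory from environment variables."""
    # Assumption: a blank token counts as "not provided".
    token = env.get("MINERU_API_TOKEN", "").strip()
    output_dir = Path(env.get("OUTPUT_DIR") or "~/mineru-downloads").expanduser()
    return {"flash_mode": not token, "output_dir": output_dir}


print(resolve_settings({}))
print(resolve_settings({"MINERU_API_TOKEN": "your_key_here", "OUTPUT_DIR": "/tmp/mineru"}))
```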

Privacy Policy

mineru-open-mcp connects to the official MinerU API (mineru.net) to parse documents.

  • Data sent: Document content (files or URLs you provide for parsing)
  • Data storage: Parsed results are temporarily cached by MinerU servers; not used for training
  • Third-party: MinerU API (mineru.net) — see MinerU Privacy Policy
  • Local data: Parsed results are saved to the target output directory. Log files are written only when ENABLE_LOG=true and are saved to MINERU_LOG_DIR.
  • Contact: OpenDataLab@pjlab.org.cn (or raise an issue at MinerU-Ecosystem)


Download files

Source Distribution

mineru_open_mcp-1.0.21.tar.gz (17.2 kB view details)

Uploaded Source

Built Distribution

mineru_open_mcp-1.0.21-py3-none-any.whl (17.1 kB view details)

Uploaded Python 3

File details

Details for the file mineru_open_mcp-1.0.21.tar.gz.

File metadata

  • Download URL: mineru_open_mcp-1.0.21.tar.gz
  • Upload date:
  • Size: 17.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for mineru_open_mcp-1.0.21.tar.gz
Algorithm Hash digest
SHA256 be472ae59790f48e8e9e042605b0527d9460fa333ff11923330259a67d714f82
MD5 43aca5e295918115b196e3e48437e895
BLAKE2b-256 36875ec1c1c931d5628afbc4d87c86f34cc9efc50b4f8a36265f4764df743fc8
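If you download the source distribution manually, you can compare its SHA256 digest with the value listed above. The following is a minimal sketch using only the standard library; the function names are hypothetical, and the demo hashes a temporary file rather than the real archive.

```python
import hashlib
import tempfile
from pathlib import Path

# SHA256 listed above for mineru_open_mcp-1.0.21.tar.gz.
EXPECTED_SHA256 = "be472ae59790f48e8e9e042605b0527d9460fa333ff11923330259a67d714f82"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected.lower()


# Demo on a throwaway file: it matches its own digest, not the published one.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp, "sample.bin")
    sample.write_bytes(b"hello")
    demo_ok = matches_published_hash(str(sample), hashlib.sha256(b"hello").hexdigest())
    demo_bad = matches_published_hash(str(sample))
print(demo_ok, demo_bad)
```

For the real check, pass the path of the downloaded archive to matches_published_hash and leave the expected value at its default.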

File details

Details for the file mineru_open_mcp-1.0.21-py3-none-any.whl.

File hashes

Hashes for mineru_open_mcp-1.0.21-py3-none-any.whl
Algorithm Hash digest
SHA256 cdc67ef58ef225ba31138c07dc1941c8a67fbf5cb8f60e50a3f8fdd347da1edc
MD5 6b21a3ae4c1312e0f2295763c6b56fcf
BLAKE2b-256 974dcd51e68c9f6064e4adb7464e71e0449a1d69014e9d444e4d8ee145cc43bd
