MCP server for MinerU - PDF parsing with MLX acceleration on Apple Silicon
🚀 MCP-MinerU
A Model Context Protocol (MCP) server that brings powerful PDF parsing capabilities to Claude using MinerU.
✨ Features
- 📄 Parse PDF files with high accuracy
- 🧮 Extract formulas and mathematical equations
- 📊 Recognize tables and preserve structure
- ⚡️ MLX acceleration on Apple Silicon (M1/M2/M3/M4)
- 🔄 Multiple backends for different use cases
- 🤖 MCP integration for seamless use with Claude
🎯 Tools
parse_pdf
Parse PDF files and extract structured content as Markdown.
Parameters:
- file_path (required): Absolute path to the PDF file
- backend (optional): pipeline | vlm-mlx-engine | vlm-transformers
- formula_enable (optional): Enable formula recognition (default: true)
- table_enable (optional): Enable table recognition (default: true)
- start_page (optional): Starting page number (default: 0)
- end_page (optional): Ending page number (default: -1 for all pages)
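As a rough illustration of how these parameters fit together, the sketch below assembles an argument dictionary for a parse_pdf call using the defaults listed above. The helper name build_parse_pdf_args and its validation rules are assumptions made for this example; they are not taken from the server's code.

```python
import os

# Backend names as listed in the parameter description above.
BACKENDS = ("pipeline", "vlm-mlx-engine", "vlm-transformers")


def build_parse_pdf_args(file_path, backend=None, formula_enable=True,
                         table_enable=True, start_page=0, end_page=-1):
    """Hypothetical helper: assemble arguments for a parse_pdf call."""
    if not os.path.isabs(file_path):
        raise ValueError("file_path must be an absolute path")
    if backend is not None and backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    args = {
        "file_path": file_path,
        "formula_enable": formula_enable,
        "table_enable": table_enable,
        "start_page": start_page,
        "end_page": end_page,
    }
    # backend is optional, so include it only when the caller chose one.
    if backend is not None:
        args["backend"] = backend
    return args


print(build_parse_pdf_args("/path/to/paper.pdf", backend="pipeline"))
```

Leaving backend out of the dictionary when it is not specified lets the server apply whatever default it uses.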
list_backends
Check system capabilities and get backend recommendations.
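The server's actual detection logic is not shown in this README. As a hedged sketch, a capability check of this kind could inspect the operating system and CPU architecture reported by Python's platform module. The function recommend_backend and the fallback to pipeline on other hardware are assumptions for illustration only.

```python
import platform


def recommend_backend(system=None, machine=None):
    """Hypothetical recommendation rule, not the server's real logic."""
    system = system or platform.system()
    machine = machine or platform.machine()
    # A native Python build on an Apple Silicon Mac reports Darwin / arm64.
    if system == "Darwin" and machine == "arm64":
        return "vlm-mlx-engine"
    # Assumed fallback for every other platform.
    return "pipeline"


print(recommend_backend())
```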
🛠️ Installation
Prerequisites
- Python 3.10-3.13
- uv (recommended) or pip
Quick Install
# Clone the repository
git clone https://github.com/TINKPA/mcp-mineru.git
cd mcp-mineru
# Install with all dependencies (one command!)
pip install -e .
That's it! The mineru[core] dependency will automatically install all backends (pipeline, vlm, mlx).
🔧 Configuration
Claude Code (Recommended)
Use the Claude Code CLI to add the server directly:
# Replace /absolute/path/to/mcp-mineru with your actual path
# Using --scope user makes it available across all your projects
claude mcp add --transport stdio --scope user mineru -- \
python /absolute/path/to/mcp-mineru/src/mcp_mineru/server.py
Or using uv:
claude mcp add --transport stdio --scope user mineru -- \
uv --directory /absolute/path/to/mcp-mineru run python src/mcp_mineru/server.py
Configuration Scope Options:
- --scope user (recommended): Available across all your projects
- --scope local: Available only in the current project (default)
- --scope project: Shared with everyone via .mcp.json file
Note: The -- (double dash) separates Claude's CLI flags from the command that runs the MCP server. Everything after -- is the actual command to execute.
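The double-dash convention can be illustrated with a few lines of Python. This is only a sketch of the general idea, not how the Claude CLI parses its arguments; the function name split_at_double_dash is made up for the example.

```python
def split_at_double_dash(argv):
    """Split an argument list at the first "--" separator."""
    if "--" not in argv:
        return list(argv), []
    index = argv.index("--")
    return argv[:index], argv[index + 1:]


argv = ["mcp", "add", "--transport", "stdio", "--scope", "user", "mineru",
        "--", "python", "/absolute/path/to/mcp-mineru/src/mcp_mineru/server.py"]
cli_args, server_command = split_at_double_dash(argv)
print(cli_args)
print(server_command)
```

Everything in the first list is treated as options for the CLI itself, and the second list is the command that launches the server.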
Claude Desktop (Manual Configuration)
Add to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
{
"mcpServers": {
"mineru": {
"command": "python",
"args": [
"/absolute/path/to/mcp-mineru/src/mcp_mineru/server.py"
]
}
}
}
Or using uv (recommended):
{
"mcpServers": {
"mineru": {
"command": "uv",
"args": [
"--directory",
"/absolute/path/to/mcp-mineru",
"run",
"python",
"src/mcp_mineru/server.py"
]
}
}
}
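If the configuration file already lists other servers, the mineru entry has to be merged in rather than pasted over them. The sketch below shows one way to do that merge in Python on an in-memory dictionary; the helper add_mineru_server and the placeholder "other" server are hypothetical, and reading or writing the real file is left out.

```python
import json


def add_mineru_server(config, repo_path):
    """Return a copy of config with a "mineru" entry under mcpServers."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers["mineru"] = {
        "command": "uv",
        "args": ["--directory", repo_path, "run", "python",
                 "src/mcp_mineru/server.py"],
    }
    updated["mcpServers"] = servers
    return updated


# Placeholder for a configuration that already contains another server.
existing = {"mcpServers": {"other": {"command": "node", "args": []}}}
merged = add_mineru_server(existing, "/absolute/path/to/mcp-mineru")
print(json.dumps(merged, indent=2))
```

The function copies the dictionaries it changes, so the original configuration object is left untouched.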
📖 Usage Examples
Example 1: Parse a PDF
User: "Please analyze this research paper: /path/to/paper.pdf"
Claude: [Calls parse_pdf tool]
"This research paper discusses... The key findings in Table 3 show..."
Example 2: Check system capabilities
User: "What's the best backend for my system?"
Claude: [Calls list_backends tool]
"Your system has Apple Silicon (M4). I recommend using the
'vlm-mlx-engine' backend for fastest performance."
Example 3: Extract specific pages
User: "Extract pages 10-15 from this PDF"
Claude: [Calls parse_pdf with start_page=9, end_page=14]
"Here's the content from pages 10-15..."
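Example 3 suggests that start_page and end_page are zero-based and that end_page is inclusive, since pages 10-15 map to 9 and 14. Assuming that reading is right, the conversion from the page numbers a reader sees can be sketched as follows; pages_to_range is a hypothetical helper name.

```python
def pages_to_range(first_page, last_page):
    """Convert a 1-based inclusive page range to start_page/end_page."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    return {"start_page": first_page - 1, "end_page": last_page - 1}


print(pages_to_range(10, 15))  # {'start_page': 9, 'end_page': 14}
```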
🏗️ Development
Run tests
pytest
Format and lint code
black src/
ruff check src/
❓ Troubleshooting
ModuleNotFoundError when running tests
If you see errors like ModuleNotFoundError: No module named 'mineru' or 'torch':
Solution: Reinstall the package to ensure all dependencies are installed:
pip install -e .
The mineru[core] dependency should automatically install all required backends.
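To confirm which modules are still missing after reinstalling, a quick check with the standard library's importlib can help. This is a small diagnostic sketch rather than part of the project; missing_modules is a name chosen for the example.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both modules are visible to this interpreter.
print(missing_modules(["mineru", "torch"]))
```

Run it with the same interpreter or virtual environment that runs pytest, since a mismatch between environments is a common cause of this error.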
🚀 Performance
On Apple Silicon (M4):
- pipeline backend: ~32 seconds/page
- vlm-mlx-engine backend: ~38 seconds/page (higher quality)
- vlm-transformers backend: ~148 seconds/page
Benchmarked on a Mac mini M4 with 16 GB of RAM.
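For planning purposes, the per-page figures above can be turned into a rough estimate for a whole document. The sketch below assumes time scales linearly with page count, which is a simplification; real timings will vary with page content and hardware.

```python
# Seconds per page, copied from the benchmark figures above.
SECONDS_PER_PAGE = {
    "pipeline": 32,
    "vlm-mlx-engine": 38,
    "vlm-transformers": 148,
}


def estimate_seconds(pages, backend):
    """Estimate total parse time, assuming linear scaling with pages."""
    return pages * SECONDS_PER_PAGE[backend]


for backend in SECONDS_PER_PAGE:
    minutes = estimate_seconds(20, backend) / 60
    print(f"{backend}: ~{minutes:.1f} min for 20 pages")
```

Under this assumption a 20-page document would take roughly 11, 13, and 49 minutes on the three backends respectively.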
📝 License
This project uses MinerU as a submodule, which is licensed under the Apache License 2.0.
🙏 Dependencies & Acknowledgments
This project is built on top of MinerU and the Model Context Protocol (MCP).