GLM OCR MCP Server

MCP server for extracting text from images and PDFs using ZhipuAI GLM-OCR.

Usage

Add the server to your MCP client's JSON configuration:

{
  "mcpServers": {
    "glm-ocr": {
      "command": "uvx",
      "args": ["glm-ocr-mcp"],
      "env": {
        "ZHIPU_API_KEY": "your_api_key_here",
        "ZHIPU_OCR_API_URL": "https://open.bigmodel.cn/api/paas/v4/layout_parsing"
      }
    }
  }
}
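If you prefer not to hardcode the API key, the same configuration can be generated from a script. The following is a minimal sketch, not part of this project: build_server_entry is a hypothetical helper, and the only values taken from this README are the server name, the uvx command, and the two environment variable names.

```python
import json
import os

# Endpoint copied from the configuration shown in this README.
DEFAULT_OCR_URL = "https://open.bigmodel.cn/api/paas/v4/layout_parsing"


def build_server_entry(api_key: str, api_url: str = DEFAULT_OCR_URL) -> dict:
    """Return the "mcpServers" fragment for glm-ocr (hypothetical helper)."""
    if not api_key:
        raise ValueError("ZHIPU_API_KEY must not be empty")
    return {
        "mcpServers": {
            "glm-ocr": {
                "command": "uvx",
                "args": ["glm-ocr-mcp"],
                "env": {
                    "ZHIPU_API_KEY": api_key,
                    "ZHIPU_OCR_API_URL": api_url,
                },
            }
        }
    }


if __name__ == "__main__":
    # Falls back to the placeholder when the variable is not set.
    key = os.environ.get("ZHIPU_API_KEY", "your_api_key_here")
    print(json.dumps(build_server_entry(key), indent=2))
```

The printed JSON can be pasted into the client configuration file. Where that file lives depends on the MCP client you use.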

Using with Claude Code

claude mcp add --scope user glm-ocr \
  --env ZHIPU_API_KEY=your_api_key_here \
  --env ZHIPU_OCR_API_URL=https://open.bigmodel.cn/api/paas/v4/layout_parsing \
  -- uvx glm-ocr-mcp

Using with Codex

Add the MCP server with the following command:

codex mcp add glm-ocr \
  --env ZHIPU_API_KEY=your_api_key_here \
  --env ZHIPU_OCR_API_URL=https://open.bigmodel.cn/api/paas/v4/layout_parsing \
  -- uvx glm-ocr-mcp

Tools

The server provides one tool:

  • extract_text: Extracts text from a local file or URL (png, jpg/jpeg, pdf).
    • By default, returns Markdown text.
    • Set return_json=true to return structured JSON instead. The JSON omits md_results and contains page parsing details such as bbox_2d, content, and label.

Parameters:

  • file_path: Local file path or URL of a png, jpg/jpeg, or pdf file
  • base64_data: Optional data URL or base64 payload (use when file_path is unavailable)
  • start_page_id: Optional first page to process (1-based; applies to PDFs only)
  • end_page_id: Optional last page to process (1-based; applies to PDFs only)
  • return_json: Optional boolean, default false. true returns JSON; false returns Markdown.
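The parameters above can be assembled on the client side before calling the tool. The sketch below is an illustration only: bytes_to_data_url, file_to_data_url, and build_arguments are hypothetical helpers, and the checks (at least one input source, 1-based pages, start not after end) are assumptions about sensible input rather than documented server behavior.

```python
import base64
from pathlib import Path
from typing import Optional

# MIME types for the formats this README lists as supported.
MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".pdf": "application/pdf",
}


def bytes_to_data_url(data: bytes, suffix: str) -> str:
    """Encode raw file bytes as a data URL suitable for base64_data."""
    mime = MIME_TYPES.get(suffix.lower())
    if mime is None:
        raise ValueError(f"unsupported file type: {suffix}")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def file_to_data_url(path: str) -> str:
    """Read a local file and encode it as a data URL."""
    file = Path(path)
    return bytes_to_data_url(file.read_bytes(), file.suffix)


def build_arguments(
    file_path: Optional[str] = None,
    base64_data: Optional[str] = None,
    start_page_id: Optional[int] = None,
    end_page_id: Optional[int] = None,
    return_json: bool = False,
) -> dict:
    """Build an extract_text argument dict, dropping unset optional values."""
    if file_path is None and base64_data is None:
        raise ValueError("provide file_path or base64_data")
    for name, value in (("start_page_id", start_page_id), ("end_page_id", end_page_id)):
        if value is not None and value < 1:
            raise ValueError(f"{name} is 1-based and must be >= 1")
    # Assumption: a start page after the end page is treated as a caller error.
    if start_page_id is not None and end_page_id is not None and start_page_id > end_page_id:
        raise ValueError("start_page_id must not exceed end_page_id")
    optional = {
        "file_path": file_path,
        "base64_data": base64_data,
        "start_page_id": start_page_id,
        "end_page_id": end_page_id,
    }
    arguments = {"return_json": return_json}
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return arguments
```

For example, build_arguments(file_path="./document.pdf", start_page_id=1, end_page_id=3) yields a dict covering the first three pages of a PDF.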

Examples

# Extract text from local image
extract_text(file_path="./screenshot.png")

# Extract text from local PDF
extract_text(file_path="./document.pdf")

# Extract text from an image URL
extract_text(file_path="https://example.com/test.jpg")

# Use base64/data URL
extract_text(base64_data="data:image/png;base64,iVBORw0KGgo...")

# Extract structured layout JSON
extract_text(file_path="https://example.com/test.png", return_json=True)
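When return_json is enabled, the result can be post-processed by label. The sketch below shows one way to do that under stated assumptions: this README names only the fields bbox_2d, content, and label, so the top-level key layout_details, the page-by-page nesting, and the sample values are all hypothetical and should be checked against a real response.

```python
from collections import defaultdict
from typing import Any, Iterator

# Hypothetical response shape: a list of pages, each a list of elements.
sample = {
    "layout_details": [
        [
            {"label": "title", "content": "Invoice", "bbox_2d": [10, 10, 200, 40]},
            {"label": "text", "content": "Total: 42", "bbox_2d": [10, 60, 200, 80]},
        ],
        [
            {"label": "text", "content": "Thank you", "bbox_2d": [10, 10, 200, 30]},
        ],
    ]
}


def iter_elements(
    result: dict[str, Any], pages_key: str = "layout_details"
) -> Iterator[tuple[int, dict[str, Any]]]:
    """Yield (1-based page number, element) pairs; pages_key is an assumption."""
    for page_number, page in enumerate(result.get(pages_key, []), start=1):
        for element in page:
            yield page_number, element


def content_by_label(result: dict[str, Any]) -> dict[str, list[str]]:
    """Group element content strings by their label, in document order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for _, element in iter_elements(result):
        grouped[element.get("label", "unknown")].append(element.get("content", ""))
    return dict(grouped)


if __name__ == "__main__":
    for label, contents in content_by_label(sample).items():
        print(label, contents)
```

Passing pages_key explicitly makes it easy to adapt the helper if the actual key differs from the assumed one.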

Development

# Create virtual environment
uv venv
source .venv/bin/activate

# Sync dependencies and install current project
uv sync

# Run server for testing
python -m glm_ocr_mcp.server

Windows PowerShell activation:

.venv\Scripts\Activate.ps1

Project Structure

glm-ocr-mcp/
├── pyproject.toml         # Project configuration
├── README.md              # Documentation
├── .env.example           # Environment variable template
├── src/
│   └── glm_ocr_mcp/
│       ├── __init__.py
│       ├── __main__.py    # Entry point
│       ├── ocr.py         # OCR client
│       └── server.py      # MCP server
