MCP server for GLM OCR to extract text from images and PDFs

GLM OCR MCP Server

MCP server for extracting text from images and PDFs using ZhipuAI GLM-OCR.

Usage

Add the following entry to your MCP client configuration:

{
  "mcpServers": {
    "glm-ocr": {
      "command": "uvx",
      "args": ["glm-ocr-mcp"],
      "env": {
        "ZHIPU_API_KEY": "your_api_key_here",
        "ZHIPU_OCR_API_URL": "https://open.bigmodel.cn/api/paas/v4/layout_parsing"
      }
    }
  }
}
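
If you manage client configuration programmatically, the entry above can be merged into an existing config with a few lines of Python. This is a minimal sketch, not part of glm-ocr-mcp: the helper names `build_server_entry` and `add_server` are hypothetical, and where your client stores its config file is left to you.

```python
import json

# Endpoint copied from the configuration above.
DEFAULT_OCR_URL = "https://open.bigmodel.cn/api/paas/v4/layout_parsing"


def build_server_entry(api_key: str, api_url: str = DEFAULT_OCR_URL) -> dict:
    """Return the "glm-ocr" server entry shown in the Usage section."""
    return {
        "command": "uvx",
        "args": ["glm-ocr-mcp"],
        "env": {"ZHIPU_API_KEY": api_key, "ZHIPU_OCR_API_URL": api_url},
    }


def add_server(config: dict, api_key: str) -> dict:
    """Return a copy of config with glm-ocr added under "mcpServers"."""
    merged = dict(config)
    # Copy the existing server map so other entries are preserved untouched.
    servers = dict(merged.get("mcpServers", {}))
    servers["glm-ocr"] = build_server_entry(api_key)
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Print a config containing only the glm-ocr entry.
    print(json.dumps(add_server({}, "your_api_key_here"), indent=2))
```

Returning a copy rather than mutating the input keeps the original config intact if writing the file fails later.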

Using with Claude Code

claude mcp add --scope user glm-ocr \
  --env ZHIPU_API_KEY=your_api_key_here \
  --env ZHIPU_OCR_API_URL=https://open.bigmodel.cn/api/paas/v4/layout_parsing \
  -- uvx glm-ocr-mcp

Using with Codex

Add the MCP server with the following command:

codex mcp add glm-ocr \
  --env ZHIPU_API_KEY=your_api_key_here \
  --env ZHIPU_OCR_API_URL=https://open.bigmodel.cn/api/paas/v4/layout_parsing \
  -- uvx glm-ocr-mcp

Tools

The server provides one tool:

extract_text: Extracts text from a local file or URL (png, jpg/jpeg, pdf)
- By default, returns Markdown text.
- Set return_json=true to return structured JSON instead. The JSON omits md_results and contains page parsing details such as bbox_2d, content, and label.
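
As a rough illustration of working with the return_json=true output, the sketch below collects every object that carries a bbox_2d key without assuming where those objects sit in the response. The helper name `iter_layout_items` and the sample payload (including its "pages" key and all values) are invented for illustration; only the field names bbox_2d, content, and label come from the description above.

```python
from typing import Any, Iterator


def iter_layout_items(node: Any) -> Iterator[dict]:
    """Yield every dict in a nested JSON value that has a "bbox_2d" key."""
    if isinstance(node, dict):
        if "bbox_2d" in node:
            yield node
        # Keep descending in case items are nested inside other objects.
        for value in node.values():
            yield from iter_layout_items(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_layout_items(item)


# Hypothetical payload: the real response layout may differ.
sample = {
    "pages": [
        [
            {"label": "title", "content": "Invoice", "bbox_2d": [10, 10, 200, 40]},
            {"label": "text", "content": "Total: 42", "bbox_2d": [10, 60, 200, 80]},
        ]
    ]
}

for item in iter_layout_items(sample):
    print(item.get("label"), item.get("bbox_2d"), item.get("content"))
```

Walking the structure generically avoids hard-coding a response schema that this README does not specify.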

Parameters:

file_path: Local file path or URL of a png, jpg/jpeg, or pdf file
base64_data: Optional data URL or base64 payload (use when file_path is unavailable)
start_page_id: Optional first page to parse (1-based; PDF only)
end_page_id: Optional last page to parse (1-based; PDF only)
return_json: Optional boolean, default false. true returns structured JSON; false returns Markdown.
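
When a file cannot be referenced by path or URL, a value for base64_data can be produced with the Python standard library. The following is a minimal sketch; `to_data_url` is a hypothetical helper, not part of this package, and it assumes the server accepts data URLs for the same three formats listed for file_path.

```python
import base64
import mimetypes
from pathlib import Path

# MIME types for the formats the tool lists: png, jpg/jpeg, pdf.
SUPPORTED_MIME_TYPES = {"image/png", "image/jpeg", "application/pdf"}


def to_data_url(path: str) -> str:
    """Encode a local png, jpg/jpeg, or pdf file as a base64 data URL."""
    # The MIME type is guessed from the file extension, not the file contents.
    mime, _ = mimetypes.guess_type(path)
    if mime not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported file type: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The result can then be passed as extract_text(base64_data=to_data_url("./screenshot.png")).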

Examples

# Extract text from local image
extract_text(file_path="./screenshot.png")

# Extract text from local PDF
extract_text(file_path="./document.pdf")

# Extract text from URL image
extract_text(file_path="https://example.com/test.jpg")

# Use base64/data URL
extract_text(base64_data="data:image/png;base64,iVBORw0KGgo...")

# Extract structured layout JSON
extract_text(file_path="https://example.com/test.png", return_json=True)
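
Outside an MCP-aware assistant, the same calls can be issued from a script. The sketch below assumes the official MCP Python SDK (the `mcp` package) is installed and that ZHIPU_API_KEY is set in the environment. `build_arguments` and `run_ocr` are hypothetical helper names, and the page-range checks are this sketch's own sanity checks rather than documented server behavior.

```python
import os
from typing import Optional

# Endpoint copied from the Usage section.
OCR_URL = "https://open.bigmodel.cn/api/paas/v4/layout_parsing"


def build_arguments(
    file_path: Optional[str] = None,
    base64_data: Optional[str] = None,
    start_page_id: Optional[int] = None,
    end_page_id: Optional[int] = None,
    return_json: bool = False,
) -> dict:
    """Assemble extract_text arguments, dropping unset optional values."""
    if not file_path and not base64_data:
        raise ValueError("provide file_path or base64_data")
    for page in (start_page_id, end_page_id):
        if page is not None and page < 1:
            raise ValueError("page ids are 1-based")
    if start_page_id and end_page_id and start_page_id > end_page_id:
        raise ValueError("start_page_id must not exceed end_page_id")
    candidates = {
        "file_path": file_path,
        "base64_data": base64_data,
        "start_page_id": start_page_id,
        "end_page_id": end_page_id,
    }
    arguments = {key: value for key, value in candidates.items() if value is not None}
    arguments["return_json"] = return_json
    return arguments


async def run_ocr(arguments: dict) -> str:
    """Start the server over stdio, call extract_text, and return its text."""
    # Imported here so build_arguments stays usable without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(
        command="uvx",
        args=["glm-ocr-mcp"],
        env={
            "ZHIPU_API_KEY": os.environ["ZHIPU_API_KEY"],
            "ZHIPU_OCR_API_URL": OCR_URL,
        },
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool("extract_text", arguments=arguments)
    # Join the text parts of the tool result; non-text parts are skipped.
    return "\n".join(part.text for part in result.content if hasattr(part, "text"))
```

To run it, import asyncio and call asyncio.run(run_ocr(build_arguments(file_path="./document.pdf", start_page_id=1, end_page_id=2))).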

Development

# Create virtual environment
uv venv
source .venv/bin/activate

# Sync dependencies and install current project
uv sync

# Run server for testing
python -m glm_ocr_mcp.server

On Windows PowerShell, activate the virtual environment with:

.venv\Scripts\Activate.ps1

Project Structure

glm-ocr-mcp/
├── pyproject.toml         # Project configuration
├── README.md              # Documentation
├── .env.example           # Environment variable template
├── src/
│   └── glm_ocr_mcp/
│       ├── __init__.py
│       ├── __main__.py    # Entry point
│       ├── ocr.py         # OCR client
│       └── server.py      # MCP server

Project details

Release history

0.1.1 (this version), Feb 10, 2026
0.1.0, Feb 10, 2026

Download files

Source Distribution

glm_ocr_mcp-0.1.1.tar.gz (48.4 kB)

Uploaded Feb 10, 2026 Source

Built Distribution

glm_ocr_mcp-0.1.1-py3-none-any.whl (6.4 kB)

Uploaded Feb 10, 2026 Python 3

File details

Details for the file glm_ocr_mcp-0.1.1.tar.gz.

File metadata

Download URL: glm_ocr_mcp-0.1.1.tar.gz
Upload date: Feb 10, 2026
Size: 48.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for glm_ocr_mcp-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`304645f10e3444698228f46603b7a99f86cdab306b45fc0fbb86f19fc51e4f4c`
MD5	`30fdab202b97a9945598eeff3276f772`
BLAKE2b-256	`b6d1044ae34b093fa6af9cc5f0ceb161b280e9b8110d02d5bb7a75326c874146`

File details

Details for the file glm_ocr_mcp-0.1.1-py3-none-any.whl.

File metadata

Download URL: glm_ocr_mcp-0.1.1-py3-none-any.whl
Upload date: Feb 10, 2026
Size: 6.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for glm_ocr_mcp-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`36aae77499fc1dfca90f205944c05a8809eba3e4f39bf5031b0f12266ebf1c6c`
MD5	`026322930d869507b260d88a0bf2d569`
BLAKE2b-256	`0c4cca108b465b9a3c2ef9a7b0851fc3f84012edfe28e2365675ee9bc9505b13`
