A Model Context Protocol server providing OCR capabilities for images and videos using GLM-4.1V-Thinking-Flash
MCP OCR Server
A Model Context Protocol (MCP) server that provides OCR (Optical Character Recognition) for images using GLM-4.1V-Thinking-Flash. Video OCR is planned but not yet available.
Features
- Image OCR: Extract text from images using GLM-4.1V-Thinking-Flash
- Video OCR: Extract text from video frames (planned feature)
- Custom Prompts: Support for custom OCR prompts
- MCP Integration: Full MCP protocol support for seamless integration
Requirements
- Python 3.10+
- GLM-4.1V-Thinking-Flash API key from ZhipuAI
Quick Start with uvx
The fastest way to get started is using uvx:
# Install and run directly
uvx mcp-server-ocr
# Or install for development
uvx --python 3.10 --with-editable . mcp-server-ocr
Installation
Using uv (recommended)
# Clone the repository
git clone <repository-url>
cd ocr
# Install dependencies
uv sync
# Run the server
uv run mcp-server-ocr
Using pip
# Clone the repository
git clone <repository-url>
cd ocr
# Install dependencies
pip install -e .
# Run the server
mcp-server-ocr
Configuration
Set your ZhipuAI API key (used to access GLM-4.1V-Thinking-Flash) as an environment variable:
export ZHIPU_API_KEY="your-api-key-here"
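Because the key comes from the environment, a launcher script may want to fail fast when it is missing. The following is a minimal sketch; the helper name `require_api_key` is hypothetical and not part of this package.

```python
import os


def require_api_key(env=None, name="ZHIPU_API_KEY"):
    """Return the API key from the environment, or raise if it is unset or blank.

    `require_api_key` is an illustrative helper, not part of mcp-server-ocr.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the server")
    return key
```

Calling `require_api_key()` before spawning the server gives a clear error up front instead of a failure on the first OCR request.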
Usage
Available Tools
- ocr_image: Perform OCR on image files
  - image_path: Path to the image file
  - prompt: Custom prompt for OCR processing (optional)
- ocr_video: Perform OCR on video frames (coming soon)
  - video_path: Path to the video file
  - prompt: Custom prompt for video OCR processing (optional)
  - frame_interval: Extract frames every N seconds (optional)
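An MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `ocr_image` using the parameter names listed above. The helper `build_tool_call`, the image path, and the prompt text are illustrative assumptions, and the shape of the server's response is not shown because this README does not specify it.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request as sent by MCP clients."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# The image path and prompt below are placeholder values.
request = build_tool_call(
    1,
    "ocr_image",
    {"image_path": "/path/to/receipt.png", "prompt": "Extract all text as plain text."},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library usually constructs this message for you; writing it out by hand is mainly useful for debugging.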
Available Prompts
- ocr_image: Extract text from an image
- ocr_video: Extract text from video frames
Build and Publish
Build the package
# Using uv
uv build
# Using pip
python -m build
Publish to PyPI
# Using uv
uv publish
# Using twine
twine upload dist/*
Development
Setup development environment
# Create a feature branch and install development dependencies
git checkout -b feature/your-feature
uv sync --dev
# Run linting and type checking
uv run ruff check .
uv run pyright .
Testing
# Run tests (when available)
uv run pytest
# Test the server manually
uv run python -m mcp_server_ocr
API Documentation
MCP Protocol
This server implements the Model Context Protocol (MCP) specification.
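For orientation, an MCP session starts with an `initialize` request followed by a `notifications/initialized` notification. The sketch below encodes both as newline-delimited JSON, the framing used by MCP's stdio transport. It assumes this server is run over stdio, which the README does not state explicitly; the client name, client version, and protocol version string are example values.

```python
import json


def encode_message(message):
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    return json.dumps(message, separators=(",", ":")) + "\n"


# Example values: adjust protocolVersion and clientInfo for your client.
initialize = {
    "jsonrpc": "2.0",
    "id": 0,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.0.1"},
    },
}
# Notifications carry no "id" because no response is expected.
initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}

handshake = encode_message(initialize) + encode_message(initialized)
print(handshake, end="")
```

A real client must wait for the server's `initialize` response before sending the notification; the two messages are concatenated here only to show the framing.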
License
MIT License