MCP server that translates documents (PPTX, PDF, DOCX, XLSX) preserving layout, with optional Gemini image translation


doc-translator-mcp / Document Translation MCP

An MCP server that translates documents (PPTX, PDF, DOCX, XLSX) while preserving the original layout, including text embedded in images. Works with any LLM client that supports MCP.

Features

- Any language pair: Chinese ↔ English, French ↔ Japanese, or any combination the host LLM supports
- No API key required for text translation: uses whatever LLM is running in your client
- Optional image translation: set GEMINI_API_KEY to automatically translate text in screenshots, diagrams, and charts via Gemini
- Format preservation: fonts, colors, sizes, and positioning are all maintained; font size auto-scales for longer translations
- Smart image filtering: skips icons, logos, and duplicates to minimize API calls
- Direct file access: runs locally, reads and writes files directly on your machine
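
The font auto-scaling mentioned above can be pictured with a small sketch. The rule used here (shrink in proportion to the length ratio, never enlarge, never drop below a floor) is an assumption for illustration only; `scaled_font_size` and the 0.6 floor are hypothetical and not taken from the package.

```python
def scaled_font_size(original_pt: float, source_text: str,
                     translated_text: str, min_ratio: float = 0.6) -> float:
    """Shrink a font size when the translation is longer than the source.

    Hypothetical heuristic: scale by len(source) / len(translation),
    clamped to the range [min_ratio, 1.0] so text never grows and never
    becomes unreadably small.
    """
    if not source_text or not translated_text:
        return original_pt
    ratio = len(source_text) / len(translated_text)
    ratio = max(min_ratio, min(1.0, ratio))
    return round(original_pt * ratio, 1)
```

A character-count ratio is a crude proxy for rendered width, especially across scripts; a real implementation would more likely measure the text against the shape's bounding box.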

Supported Formats

Format	Extension	Text	Images	Rebuild	Summary
PowerPoint	`.pptx`	✅	✅	✅	✅ (new slide)
PDF	`.pdf`	✅	—	✅	✅ (new page)
Word	`.docx`	✅	—	✅	✅ (prepended)
Excel	`.xlsx`	✅	—	✅	✅ (new sheet)

Quick Start

Option 1: One-line install (recommended)

Add to your MCP client config (Cursor, CodeBuddy, Claude Desktop, etc.):

{
  "mcpServers": {
    "doc-translator": {
      "command": "uvx",
      "args": ["doc-translator-mcp"],
      "env": {
        "GEMINI_API_KEY": "<your-google-ai-api-key>"
      }
    }
  }
}

Requires uv to be installed. uvx then downloads and runs the MCP server automatically.

Option 2: Install from PyPI

pip install doc-translator-mcp

Then add to your MCP client config:

{
  "mcpServers": {
    "doc-translator": {
      "command": "doc-translator-mcp",
      "env": {
        "GEMINI_API_KEY": "<your-google-ai-api-key>"
      }
    }
  }
}

Option 3: Install from source

git clone https://github.com/camushlm/doc-translator-mcp.git
cd doc-translator-mcp
uv venv --python 3.12
source .venv/bin/activate
uv pip install -e .

Then use the same config as Option 2.

GEMINI_API_KEY is optional. Without it, text translation works fully; image translation falls back to LLM-powered annotations. Get a free key at https://aistudio.google.com/apikey
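
Because the key is optional, the client config can be generated with or without the `env` block. The sketch below builds the Option 1 entry in Python; `build_client_config` is a hypothetical helper, and it assumes your MCP client accepts a server entry that has no `env` block.

```python
import json
import os
from typing import Optional


def build_client_config(gemini_api_key: Optional[str] = None) -> dict:
    """Build the mcpServers entry for the uvx launch shown in Option 1."""
    server = {"command": "uvx", "args": ["doc-translator-mcp"]}
    if gemini_api_key:
        # Only add the env block when a key is actually available.
        server["env"] = {"GEMINI_API_KEY": gemini_api_key}
    return {"mcpServers": {"doc-translator": server}}


if __name__ == "__main__":
    config = build_client_config(os.environ.get("GEMINI_API_KEY"))
    print(json.dumps(config, indent=2))
```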

How It Works

User: "Translate this PPT to English"

  1. extract_document(file_path)
     → Returns text blocks with unique IDs and context

  2. LLM translates each text block

  3. extract_images(file_path)             [PPTX only]
     → Filters out icons, logos, duplicates
     → If GEMINI_API_KEY: translate_images() for automatic translation
     → If no key: LLM adds text annotations beside images

  4. rebuild_document(translations, image_replacements)
     → Produces translated file preserving original layout
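
The text-only path through these steps can be sketched as a small driver. This is an illustration under assumptions, not the client's actual logic: `call_tool` stands in for whatever your MCP client uses to invoke a tool and is assumed to return the tool's JSON text, `translate_text` stands in for the LLM, and the response is assumed to carry a top-level `text_blocks` list.

```python
import json
from typing import Callable

# (tool name, arguments) -> JSON text returned by the tool (assumed shape)
ToolCaller = Callable[[str, dict], str]


def translate_document(call_tool: ToolCaller,
                       translate_text: Callable[[str], str],
                       file_path: str) -> dict:
    """Run extract -> translate -> rebuild for the text-only path."""
    extracted = json.loads(
        call_tool("extract_document", {"file_path": file_path})
    )
    # Map every block ID to its translated text.
    translations = {
        block["id"]: translate_text(block["text"])
        for block in extracted["text_blocks"]
    }
    rebuilt = call_tool("rebuild_document", {
        "source_file_path": file_path,
        # rebuild_document takes the mapping as a JSON string.
        "translations": json.dumps(translations, ensure_ascii=False),
    })
    return json.loads(rebuilt)
```

Injecting `call_tool` keeps the sketch independent of any particular MCP client library and makes it easy to exercise with a stub.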

Translation Modes

Mode	Requirements	What it does
Text only	Any LLM, no API key	`extract → translate → rebuild`
Full (text + images)	`GEMINI_API_KEY`	Text + Gemini regenerates images with translated text
Annotate (text + captions)	Multimodal LLM, no API key	Text + LLM describes image text → caption boxes added
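
The table above amounts to a two-question decision. A minimal sketch, assuming the key takes priority over the annotate fallback; `select_mode` is hypothetical, and the `text_only` label is made up here (the tool descriptions only name `full` and `annotate`).

```python
def select_mode(has_gemini_key: bool, llm_is_multimodal: bool) -> str:
    """Pick a translation mode from the two requirements in the table."""
    if has_gemini_key:
        return "full"        # text + Gemini-regenerated images
    if llm_is_multimodal:
        return "annotate"    # text + caption boxes written by the LLM
    return "text_only"       # extract -> translate -> rebuild
```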

Tools

`extract_document`

Extract all translatable text blocks from a document.

Parameter	Type	Required	Description
`file_path`	string	✅	Absolute path to the document

Returns JSON with text_blocks — each has id, text, and context.
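
Before rebuilding, it can help to confirm that every extracted block received a translation. The check below is a sketch that assumes the response is a JSON object with a top-level `text_blocks` list whose items carry `id` and `text`; the exact shape is not documented here, and `missing_block_ids` is a hypothetical helper.

```python
import json


def missing_block_ids(extract_response: str, translations: dict) -> list:
    """Return IDs of non-empty text blocks that have no translation yet."""
    blocks = json.loads(extract_response)["text_blocks"]
    return [
        block["id"]
        for block in blocks
        # Whitespace-only blocks need no translation.
        if block["text"].strip() and block["id"] not in translations
    ]
```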

`extract_images`

Extract images from a document for inspection or translation (PPTX only).

Parameter	Type	Required	Description
`file_path`	string	✅	Path to the document
`output_dir`	string	—	Directory for saved images

Returns JSON with images, mode (full/annotate), and guidance.
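
The filtering of icons, logos, and duplicates could look roughly like the sketch below. It is not the package's implementation: the size threshold, the `ImageInfo` type, and the choice of SHA-256 over the raw bytes for duplicate detection are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ImageInfo:
    image_id: str
    width: int    # pixels
    height: int   # pixels
    data: bytes   # raw image bytes


def filter_images(images: Iterable[ImageInfo], min_side: int = 100) -> List[ImageInfo]:
    """Drop small images (likely icons or logos) and byte-identical duplicates."""
    seen_digests = set()
    kept = []
    for image in images:
        if min(image.width, image.height) < min_side:
            continue  # too small to be a screenshot, diagram, or chart
        digest = hashlib.sha256(image.data).hexdigest()
        if digest in seen_digests:
            continue  # same bytes already kept once
        seen_digests.add(digest)
        kept.append(image)
    return kept
```

Hashing raw bytes only catches exact duplicates; the same picture re-encoded at a different quality would pass through.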

`translate_images`

Translate text within images using Gemini. Requires GEMINI_API_KEY.

Parameter	Type	Required	Description
`file_path`	string	✅	Path to the document
`target_language`	string	—	Target language (default: English)
`source_language`	string	—	Source language (auto-detect if empty)
`output_dir`	string	—	Directory for images
`gemini_api_key`	string	—	API key (falls back to env var)
`gemini_model`	string	—	Model (default: gemini-3.1-flash-image-preview)

Returns JSON with image_replacements mapping for rebuild_document.
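
The "falls back to env var" behaviour of `gemini_api_key` can be sketched as follows. `resolve_gemini_key` is a hypothetical name and the precedence shown (explicit parameter first, then `GEMINI_API_KEY`) is an assumption based on the parameter description.

```python
import os
from typing import Optional


def resolve_gemini_key(explicit_key: str = "") -> Optional[str]:
    """Prefer the explicit parameter, then the GEMINI_API_KEY env var."""
    key = explicit_key.strip() or os.environ.get("GEMINI_API_KEY", "").strip()
    # None signals that image translation is unavailable.
    return key or None
```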

`rebuild_document`

Rebuild a document with translated text and optionally replaced/annotated images.

Parameter	Type	Required	Description
`source_file_path`	string	✅	Path to the original document
`translations`	string	✅	JSON `{block_id: translated_text}`
`summary`	string	—	Summary text for the first page
`output_file_path`	string	—	Custom output path
`image_replacements`	string	—	JSON `{image_id: "/path/to/translated.png"}`
`image_annotations`	string	—	JSON `{image_id: "caption text"}`

Returns JSON with path to the translated file.
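
Since the mapping parameters are typed as strings holding JSON, a caller has to serialize its dictionaries before invoking the tool. The helper below is a hypothetical sketch of that step; it omits optional parameters that were not supplied.

```python
import json
from typing import Optional


def build_rebuild_args(source_file_path: str,
                       translations: dict,
                       image_replacements: Optional[dict] = None,
                       image_annotations: Optional[dict] = None,
                       summary: Optional[str] = None,
                       output_file_path: Optional[str] = None) -> dict:
    """Assemble rebuild_document arguments, JSON-encoding the mappings."""
    args = {
        "source_file_path": source_file_path,
        "translations": json.dumps(translations, ensure_ascii=False),
    }
    if image_replacements:
        args["image_replacements"] = json.dumps(image_replacements, ensure_ascii=False)
    if image_annotations:
        args["image_annotations"] = json.dumps(image_annotations, ensure_ascii=False)
    if summary:
        args["summary"] = summary
    if output_file_path:
        args["output_file_path"] = output_file_path
    return args
```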

`list_supported_formats`

Returns supported formats, current image translation mode, and recommended workflow.

Example

You: Please translate this PPT to English /path/to/报告.pptx

AI: [calls extract_document] → 281 text blocks extracted
AI: [translates all blocks Chinese → English]
AI: [calls extract_images] → 16 unique images found
AI: [calls translate_images] → 11 images with text translated via Gemini
AI: [calls rebuild_document with translations + image_replacements]
AI: Done! Saved to /path/to/报告_translated.pptx
    281 text blocks and 11 images translated.
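
The output name in this example suggests a `_translated` suffix inserted before the extension. The sketch below reproduces that naming; it is inferred from the example alone, so treat both the rule and the `default_output_path` helper as assumptions.

```python
from pathlib import PurePosixPath


def default_output_path(source: str, suffix: str = "_translated") -> str:
    """Insert a suffix between the file stem and its extension."""
    path = PurePosixPath(source)
    return str(path.with_name(f"{path.stem}{suffix}{path.suffix}"))
```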

Architecture

┌────────────────────────────────────────────┐
│  LLM Client (Cursor / CodeBuddy / Claude)  │
│  └─ uses its own LLM for text translation  │
└──────────────┬─────────────────────────────┘
               │ MCP protocol (STDIO)
               ▼
┌─────────────────────────────────┐
│  doc-translator-mcp             │
│  ├─ server.py    (MCP tools)    │
│  ├─ pptx_handler (text+images)  │
│  ├─ pdf_handler  (text overlay) │
│  ├─ docx_handler (text replace) │
│  └─ xlsx_handler (cell replace) │
└──────────────┬──────────────────┘
               │ optional
               ▼
┌─────────────────────────────────┐
│  Gemini API (image translation) │
│  gemini-3.1-flash-image-preview │
└─────────────────────────────────┘
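
On the wire, the client-to-server hop in this diagram is JSON-RPC 2.0 over the server's stdin and stdout, one message per line. The sketch below builds a `tools/call` request by hand to show the shape; in practice the MCP client does this for you, and a real session begins with an `initialize` handshake that is left out here. `tools_call_request` is a hypothetical helper.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize one MCP tools/call request as a newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # Compact json.dumps output never contains a raw newline, so one
    # message always occupies exactly one line on the stream.
    return json.dumps(message, ensure_ascii=False) + "\n"
```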

Environment Variables

Variable	Required	Description
`GEMINI_API_KEY`	Optional	Google AI API key for image translation. Get a free key at https://aistudio.google.com/apikey

License

MIT

Release history

1.3.0: Apr 16, 2026
1.2.0: Apr 6, 2026
1.1.2: Apr 6, 2026
1.1.1: Apr 6, 2026
1.1.0: Apr 6, 2026
1.0.0: Apr 6, 2026 (this version)

