
MCP server that translates documents (PPTX, PDF, DOCX, XLSX) preserving layout, with optional Gemini image translation


doc-translator-mcp / Document Translation MCP


An MCP server that translates documents (PPTX, PDF, DOCX, XLSX) while preserving the original layout — including text embedded in images. Works with any LLM client that supports MCP.


Features

  • Any language pair — Chinese ↔ English, French ↔ Japanese, or any combination the host LLM supports
  • No API key required for text translation — uses whatever LLM is running in your client
  • Optional image translation — set GEMINI_API_KEY to automatically translate text in screenshots, diagrams, and charts via Gemini
  • Format preservation — fonts, colors, sizes, positioning all maintained; font size auto-scales for longer translations
  • Smart image filtering — skips icons, logos, and duplicates to minimize API calls
  • Best with Claude — tested and optimized for Claude; other LLMs may work but results vary
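The "smart image filtering" feature can be pictured with a small sketch. This is not the project's actual code: the `ImageInfo` record, the `filter_images` helper, and the 100-pixel threshold are all assumptions made for illustration. It only shows how skipping small images and byte-identical duplicates reduces the number of images sent to Gemini.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageInfo:
    """Hypothetical record for one extracted image."""
    name: str
    width: int
    height: int
    data: bytes


def filter_images(images, min_side=100):
    """Keep images that are large enough and not seen before.

    min_side is an assumed pixel threshold, not a value taken from the project.
    """
    seen = set()
    kept = []
    for img in images:
        # Small images are treated as icons or logos and skipped.
        if min(img.width, img.height) < min_side:
            continue
        # Byte-identical images only need to be translated once.
        digest = hashlib.sha256(img.data).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        kept.append(img)
    return kept


images = [
    ImageInfo("chart.png", 800, 600, b"chart-bytes"),
    ImageInfo("icon.png", 32, 32, b"icon-bytes"),
    ImageInfo("chart_copy.png", 800, 600, b"chart-bytes"),
]
print([img.name for img in filter_images(images)])  # ['chart.png']
```

A real filter would likely need more than a size check to recognize logos; the sketch only covers the size and duplicate cases.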

Supported Formats

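| Format | Extension | Text | Images | Rebuild | Summary |
| --- | --- | --- | --- | --- | --- |
| PowerPoint | .pptx | ✅ | ✅ | ✅ | ✅ (new slide) |
| PDF | .pdf | ✅ | No | ✅ | ✅ (new page) |
| Word | .docx | ✅ | No | ✅ | ✅ (prepended) |
| Excel | .xlsx | ✅ | No | ✅ | ✅ (new sheet) |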

Install

Add to your MCP client config (Cursor, CodeBuddy, Claude Desktop, etc.):

{
  "mcpServers": {
    "doc-translator": {
      "command": "uvx",
      "args": ["doc-translator-mcp"],
      "env": {
        "GEMINI_API_KEY": "<optional-google-ai-api-key>"
      }
    }
  }
}

That's it. uvx (the tool runner that ships with uv) downloads and runs the server automatically.

GEMINI_API_KEY is optional. Without it, text translation works fully; only image translation (PPTX) requires it. Get a free key at https://aistudio.google.com/apikey

Alternative: pip install

pip install doc-translator-mcp

Then use "command": "doc-translator-mcp" instead of uvx.
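The two install paths differ only in the `command` entry, which a short sketch makes concrete. The helper name `build_client_config` is hypothetical, and the empty `args` list for the pip variant is an assumption, since the text above only says which command to use.

```python
import json


def build_client_config(use_uvx=True, gemini_api_key=None):
    """Return an MCP client config dict for doc-translator-mcp."""
    if use_uvx:
        server = {"command": "uvx", "args": ["doc-translator-mcp"]}
    else:
        # After `pip install doc-translator-mcp` the script is called directly.
        # Passing no args is an assumption, not documented behavior.
        server = {"command": "doc-translator-mcp", "args": []}
    if gemini_api_key:
        # The key is optional; omit the env block for text-only translation.
        server["env"] = {"GEMINI_API_KEY": gemini_api_key}
    return {"mcpServers": {"doc-translator": server}}


print(json.dumps(build_client_config(use_uvx=False), indent=2))
```

Paste the printed JSON into your client's MCP config file in place of the uvx version.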

How It Works

User: "Translate this PPT to English"

  1. extract_document(file_path)
     → Returns text blocks with unique IDs and context

  2. LLM translates each text block

  3. extract_images(file_path)             [PPTX only]
     → Filters out icons, logos, duplicates
     → If GEMINI_API_KEY is set: translate_images() for auto translation
     → If no key: LLM adds text annotations beside images

  4. rebuild_document(translations, image_replacements)
     → Produces translated file preserving original layout
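The four steps above can be sketched as one driver function. This is a rough illustration, not the server's real interface. The tool names come from the outline, but `call_tool`, `translate_text`, the arguments passed to `translate_images`, and every result shape are assumptions. The fake client exists only so the flow can be run end to end.

```python
def translate_pptx(file_path, call_tool, translate_text, has_gemini_key):
    """Run the four-step flow through a caller-supplied MCP client.

    call_tool(name, arguments) and translate_text(text) are hypothetical
    callables; all result shapes below are assumed for illustration.
    """
    # Step 1: extract text blocks (assumed shape: [{"id": ..., "text": ...}]).
    blocks = call_tool("extract_document", {"file_path": file_path})

    # Step 2: the host LLM translates each block, keyed by block ID.
    translations = {b["id"]: translate_text(b["text"]) for b in blocks}

    # Step 3: images (PPTX only); Gemini is used only when a key is set.
    call_tool("extract_images", {"file_path": file_path})
    image_replacements = {}
    if has_gemini_key:
        image_replacements = call_tool("translate_images", {})

    # Step 4: rebuild the document from translations and image replacements.
    return call_tool("rebuild_document", {
        "translations": translations,
        "image_replacements": image_replacements,
    })


def fake_call_tool(name, arguments):
    """Stand-in for a real MCP client, used only to exercise the flow."""
    if name == "extract_document":
        return [{"id": "b1", "text": "Bonjour"}, {"id": "b2", "text": "Merci"}]
    if name == "extract_images":
        return ["img1"]
    if name == "translate_images":
        return {"img1": "img1_translated"}
    if name == "rebuild_document":
        return arguments
    raise ValueError(f"unknown tool: {name}")


result = translate_pptx("deck.pptx", fake_call_tool, lambda t: f"EN({t})", True)
print(result["translations"])
```

In practice the LLM client drives these calls itself; the function only makes the ordering and data flow explicit.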

Translation Modes

| Mode | Requirements | What it does |
| --- | --- | --- |
| Text only | Any LLM, no API key | extract → translate → rebuild |
| Full (text + images) | GEMINI_API_KEY | Text + Gemini regenerates images with translated text |
| Annotate (text + captions) | Multimodal LLM, no API key | Text + LLM describes image text → caption boxes added |
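The choice between the three modes reduces to two yes/no questions, sketched below. The function name `choose_mode` and the returned labels are hypothetical. Preferring Gemini whenever a key is present follows the workflow outline, but treat the exact precedence as an assumption.

```python
def choose_mode(has_gemini_key, llm_is_multimodal):
    """Pick a translation mode from the available capabilities."""
    if has_gemini_key:
        # Gemini regenerates images with translated text.
        return "full"
    if llm_is_multimodal:
        # The LLM reads image text itself and adds caption boxes.
        return "annotate"
    # Any LLM can still do extract -> translate -> rebuild.
    return "text_only"


print(choose_mode(has_gemini_key=False, llm_is_multimodal=True))  # annotate
```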

Tools

| Tool | Description |
| --- | --- |
| extract_document | Extract translatable text blocks from a document |
| extract_images | Extract images from PPTX for inspection or translation |
| translate_images | Translate text in images via Gemini (requires API key) |
| rebuild_document | Rebuild document with translated text and images |
| list_supported_formats | List supported formats and current workflow |
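Your MCP client normally issues these tool calls for you. For readers curious about the wire format, here is a minimal sketch of the JSON-RPC 2.0 `tools/call` request that MCP defines. The helper `tool_call_request` is hypothetical. The `file_path` argument name is taken from the workflow outline, and the full argument schema of each tool is not documented here.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call_request(name, arguments):
    """Build an MCP `tools/call` request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Over the STDIO transport each message is sent as one line of JSON.
line = json.dumps(
    tool_call_request("extract_document", {"file_path": "/path/to/report.pptx"})
)
print(line)
```

A real session also begins with an `initialize` handshake before any tool call, which the client library handles.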

Example

You: Please translate this PPT into English: /path/to/报告.pptx

AI: [calls extract_document] → 281 text blocks extracted
AI: [translates all blocks Chinese → English]
AI: [calls extract_images] → 16 unique images found
AI: [calls translate_images] → 11 images translated via Gemini
AI: [calls rebuild_document]
AI: Done! Saved to /path/to/报告_translated.pptx
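In the example, the output file sits next to the input with `_translated` added before the extension. The sketch below reproduces that naming. The helper `translated_path` is hypothetical, and whether the server always uses this suffix is an assumption drawn from this one example.

```python
from pathlib import PurePosixPath


def translated_path(src, suffix="_translated"):
    """Insert a suffix between the file stem and its extension."""
    p = PurePosixPath(src)
    return str(p.with_name(p.stem + suffix + p.suffix))


print(translated_path("/path/to/报告.pptx"))  # /path/to/报告_translated.pptx
```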

Architecture

┌──────────────────────────────────────────┐
│  LLM Client (Cursor / CodeBuddy / Claude)│
│  └─ uses its own LLM for text translation│
└──────────────────┬───────────────────────┘
                   │ MCP (STDIO)
                   ▼
┌──────────────────────────────────────────┐
│  doc-translator-mcp                      │
│  ├─ pptx_handler  (text + images)        │
│  ├─ pdf_handler   (text overlay)         │
│  ├─ docx_handler  (text replace)         │
│  └─ xlsx_handler  (cell replace)         │
└──────────────────┬───────────────────────┘
                   │ optional
                   ▼
┌──────────────────────────────────────────┐
│  Gemini API (image translation)          │
└──────────────────────────────────────────┘
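The middle box of the diagram implies one handler per file type. A minimal sketch of such a dispatch follows. The handler names are copied from the diagram, while the lookup table, the `pick_handler` function, and the error message are assumptions about how the routing might look.

```python
from pathlib import PurePath

# Handler names as shown in the diagram; the mapping itself is illustrative.
HANDLERS = {
    ".pptx": "pptx_handler",
    ".pdf": "pdf_handler",
    ".docx": "docx_handler",
    ".xlsx": "xlsx_handler",
}


def pick_handler(file_path):
    """Return the handler name for a file, based on its extension."""
    ext = PurePath(file_path).suffix.lower()
    try:
        return HANDLERS[ext]
    except KeyError:
        raise ValueError(f"Unsupported format: {ext or '(none)'}") from None


print(pick_handler("Quarterly Report.PPTX"))  # pptx_handler
```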

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| GEMINI_API_KEY | Optional | Google AI API key for image translation. Free at https://aistudio.google.com/apikey |

License

MIT
