ZMP Markdown Translator
A high-performance markdown translator that supports multiple languages and preserves markdown formatting. Uses OpenAI's GPT models for translation.
Features
- Translates entire directories of markdown files
- Preserves markdown formatting and structure
- Supports multiple target languages simultaneously
- Handles large files through automatic chunking
- Maintains Docusaurus-compatible directory structures
- Shows real-time progress with colorized output
- Enforces consistent translation rules for technical documentation
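The automatic chunking listed above can be pictured with a small sketch. This is not the package's actual implementation: the helper name `split_markdown`, the choice to break on blank lines, and measuring size in characters are all assumptions made for illustration. The 4000 limit simply echoes the MAX_CHUNK_SIZE value from the example .env file.

```python
def split_markdown(text: str, max_chars: int = 4000) -> list[str]:
    """Split text into chunks of at most max_chars, breaking only on blank lines.

    Hypothetical helper. A single paragraph longer than max_chars is kept
    whole rather than cut mid-sentence, so a chunk can exceed the limit.
    """
    chunks = []
    current = []  # paragraphs collected for the chunk being built
    size = 0      # length of "\n\n".join(current)
    for paragraph in text.split("\n\n"):
        # Joining to an existing chunk costs the paragraph plus a blank line.
        added = len(paragraph) + (2 if current else 0)
        if current and size + added > max_chars:
            chunks.append("\n\n".join(current))
            current, size = [paragraph], len(paragraph)
        else:
            current.append(paragraph)
            size += added
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Because chunks are cut only at paragraph boundaries, joining them again with blank lines reproduces the input, which is one way to keep markdown structure intact across chunks.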
Translation Rules
The translator follows strict rules to ensure consistent and accurate translations:
Preserved in English
- Front matter between --- markers (including id, title, sidebar_position)
- All section headers (including those with HTML tags)
- All HTML-wrapped headers and subheadings
- Product names (e.g., Cloud Z CP)
- Platform names (e.g., Kubernetes)
- Service names (e.g., Container Management Service)
- Tool names (e.g., Chrome, Gitea)
- Version numbers and technical specifications
- Table headers in markdown tables
- Section IDs in curly braces
Translated to Target Language
- All paragraphs and content text
- Descriptions and explanations
- UI messages and instructions
- List items and bullet points
- Table content (except headers)
- Sentences containing product names (while keeping the names in English)
Consistency Requirements
- No mixed language content within sentences
- Complete translation of all paragraphs
- Preservation of markdown structure and formatting
- Exact maintenance of whitespace and line numbers
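Several of the rules above can be checked mechanically after a translation run. The following is a minimal sketch of such a check, not part of the package: the function names are hypothetical, and it treats any line starting with `#` as a section header, which is a simplification.

```python
def split_front_matter(text: str) -> tuple[str, str]:
    """Return (front_matter, body); front_matter includes both --- markers."""
    lines = text.split("\n")
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return "\n".join(lines[: i + 1]), "\n".join(lines[i + 1 :])
    return "", text  # no front matter found


def structure_problems(source: str, translated: str) -> list[str]:
    """List rule violations found by comparing source and translated text."""
    problems = []
    src_lines = source.split("\n")
    dst_lines = translated.split("\n")
    if len(src_lines) != len(dst_lines):
        problems.append("line count differs")
    if split_front_matter(source)[0] != split_front_matter(translated)[0]:
        problems.append("front matter changed")
    # Assumption: headers are lines that start with "#".
    src_headers = [line for line in src_lines if line.startswith("#")]
    dst_headers = [line for line in dst_lines if line.startswith("#")]
    if src_headers != dst_headers:
        problems.append("section headers changed")
    return problems
```

An empty result means the translated text kept the front matter, the headers, and the line count of the source.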
Installation
# Using Poetry (recommended)
poetry install
# Or using pip
pip install zmp-md-translator
Usage
Basic Command Structure
zmp-translate \
--source-dir SOURCE_PATH \
--target-dir TARGET_DIR \
--languages LANG_CODES \
--solution SOLUTION_TYPE
Required parameters:
- SOURCE_PATH: Path to a markdown file or directory containing markdown files
- LANG_CODES: Comma-separated list of target language codes
- SOLUTION_TYPE: Type of documentation (zcp, apim, or amdp)
Optional parameters:
- TARGET_DIR: Target directory for translations (default: "i18n")
- --model: OpenAI model to use (overrides .env setting)
- --chunk-size: Maximum chunk size for translation
- --concurrent: Maximum concurrent requests
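A cap on concurrent requests such as --concurrent is commonly enforced with a semaphore. The sketch below shows that general technique with asyncio. It is an assumption about how such a limit could work, not a description of the package internals, and `translate_all` and its `translate` callback are hypothetical names.

```python
import asyncio


async def translate_all(chunks, translate, max_concurrent=5):
    """Run translate(chunk) for every chunk, at most max_concurrent at a time.

    `translate` is any async callable supplied by the caller (hypothetical).
    Results come back in the same order as `chunks`.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run(chunk):
        # Only max_concurrent coroutines can be inside this block at once.
        async with semaphore:
            return await translate(chunk)

    return await asyncio.gather(*(run(chunk) for chunk in chunks))
```

`asyncio.gather` preserves input order, so the translated chunks can be joined back together without extra bookkeeping.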
Example Usage
# Translate a directory
zmp-translate \
--source-dir "./repo/docs/zcp/v2.0" \
--target-dir "./repo/i18n" \
--languages "ko,ja,zh" \
--solution zcp
# Translate a single file
zmp-translate \
--source-dir "./repo/docs/zcp/v2.0/FAQ.mdx" \
--target-dir "./repo/i18n" \
--languages "ko" \
--solution zcp
Using short options:
# Directory translation
zmp-translate -s "./repo/docs/zcp/v2.0" -t "./repo/i18n" -l "ko,ja,zh" --solution zcp
# Single file translation
zmp-translate -s "./repo/docs/zcp/v2.0/FAQ.mdx" -t "./repo/i18n" -l "ko" --solution zcp
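For readers who want to see how the documented options fit together, here is a sketch of an equivalent parser built with argparse. It only mirrors the flags shown above. How the real CLI is implemented is not stated in this README, so the function names and the choice of argparse are assumptions.

```python
import argparse


def split_codes(value: str) -> list[str]:
    """Turn "ko,ja,zh" into ["ko", "ja", "zh"]."""
    return [code.strip() for code in value.split(",") if code.strip()]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented zmp-translate options."""
    parser = argparse.ArgumentParser(prog="zmp-translate")
    parser.add_argument("-s", "--source-dir", required=True,
                        help="markdown file or directory of markdown files")
    parser.add_argument("-t", "--target-dir", default="i18n",
                        help="target directory for translations")
    parser.add_argument("-l", "--languages", required=True, type=split_codes,
                        help="comma-separated target language codes")
    parser.add_argument("--solution", required=True,
                        choices=["zcp", "apim", "amdp"])
    parser.add_argument("--model", default=None,
                        help="OpenAI model (overrides the .env setting)")
    parser.add_argument("--chunk-size", type=int, default=None)
    parser.add_argument("--concurrent", type=int, default=None)
    return parser
```

Parsing `["-s", "./docs", "-l", "ko,ja", "--solution", "zcp"]` with this parser leaves `target_dir` at its documented default of "i18n".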
Output Directory Structure
The translator creates a Docusaurus-compatible directory structure:
i18n/
├── ko/
│   └── docusaurus-plugin-content-docs-zcp/
│       └── current/
│           └── v2.0/
│               ├── FAQ.mdx
│               └── ...
├── ja/
│   └── docusaurus-plugin-content-docs-zcp/
│       └── current/
│           └── v2.0/
│               ├── FAQ.mdx
│               └── ...
└── zh/
    └── docusaurus-plugin-content-docs-zcp/
        └── current/
            └── v2.0/
                ├── FAQ.mdx
                └── ...
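The mapping from a source file to its translated location can be sketched as a small path function. The layout is inferred from the tree above and the earlier directory example, where the last component of the source directory (v2.0) is kept under current/. The function name and that inference are assumptions, and the sketch covers only the directory case.

```python
from pathlib import PurePosixPath


def translated_path(source_file, source_dir, target_dir, lang, solution):
    """Compute where the translation of source_file would be written.

    Hypothetical helper: keeps the last component of source_dir
    (for example "v2.0") in the output path.
    """
    source_dir = PurePosixPath(source_dir)
    # Path of the file relative to the parent of the source directory,
    # so "docs/zcp/v2.0/FAQ.mdx" becomes "v2.0/FAQ.mdx".
    relative = PurePosixPath(source_file).relative_to(source_dir.parent)
    plugin_dir = f"docusaurus-plugin-content-docs-{solution}"
    return PurePosixPath(target_dir) / lang / plugin_dir / "current" / relative
```

For example, `translated_path("./repo/docs/zcp/v2.0/FAQ.mdx", "./repo/docs/zcp/v2.0", "./repo/i18n", "ko", "zcp")` yields `repo/i18n/ko/docusaurus-plugin-content-docs-zcp/current/v2.0/FAQ.mdx`.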
Supported Language Codes
The following language codes are supported:
| Code | Language |
|---|---|
| ko | Korean |
| fr | French |
| ja | Japanese |
| es | Spanish |
| de | German |
| zh | Chinese |
| ru | Russian |
| it | Italian |
| pt | Portuguese |
| ar | Arabic |
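A comma-separated language list can be validated against this table before any translation starts. The following sketch assumes nothing about the package beyond the codes listed above. The function name and the error message are illustrative.

```python
# Codes and names copied from the table above.
SUPPORTED_LANGUAGES = {
    "ko": "Korean", "fr": "French", "ja": "Japanese", "es": "Spanish",
    "de": "German", "zh": "Chinese", "ru": "Russian", "it": "Italian",
    "pt": "Portuguese", "ar": "Arabic",
}


def parse_language_codes(raw: str) -> list[str]:
    """Split "ko, ja,zh" into codes and reject any code not in the table."""
    codes = [code.strip().lower() for code in raw.split(",") if code.strip()]
    unknown = [code for code in codes if code not in SUPPORTED_LANGUAGES]
    if unknown:
        raise ValueError(f"Unsupported language code(s): {', '.join(unknown)}")
    return codes
```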
Environment Configuration
Create a .env file in your project root:
# OpenAI Configuration
OPENAI_API_KEY=your-api-key-here
OPENAI_MODEL=your-model-here
# Performance Settings
MAX_CHUNK_SIZE=4000
MAX_CONCURRENT_REQUESTS=5
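Once these variables are in the process environment (for instance after exporting them or loading the file with a dotenv loader), they can be read into a typed settings object. This is a sketch only: the `Settings` class and `load_settings` function are hypothetical, and the fallback values 4000 and 5 are copied from the example above rather than confirmed defaults.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_key: str
    model: str
    max_chunk_size: int
    max_concurrent_requests: int


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of variables (os.environ by default)."""
    env = os.environ if env is None else env
    return Settings(
        api_key=env["OPENAI_API_KEY"],  # KeyError if the key is missing
        model=env["OPENAI_MODEL"],
        # Assumed fallbacks, taken from the example .env values.
        max_chunk_size=int(env.get("MAX_CHUNK_SIZE", 4000)),
        max_concurrent_requests=int(env.get("MAX_CONCURRENT_REQUESTS", 5)),
    )
```

Passing a plain dict instead of relying on `os.environ` keeps the function easy to test.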
Development
# Install dependencies
poetry install
# Run tests
poetry run test
# Run with watch mode (development)
poetry run watch
License
This project is distributed under the MIT License. See the LICENSE file for more information.