ZMP Markdown Translator

A high-performance markdown translator that supports multiple languages and preserves markdown formatting. It uses OpenAI's GPT models for translation.

Features

  • Translates entire directories of markdown files
  • Preserves markdown formatting and structure
  • Supports multiple target languages simultaneously
  • Handles large files through automatic chunking
  • Maintains Docusaurus-compatible directory structures
  • Shows real-time progress with colorized output
  • Enforces consistent translation rules for technical documentation
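
The chunking feature listed above can be pictured with a small sketch. This is not the package's actual algorithm, which is not documented here. It assumes the chunk size is measured in characters and that chunks break on blank lines; `chunk_markdown` and `max_chars` are hypothetical names.

```python
def chunk_markdown(text: str, max_chars: int = 4000) -> list[str]:
    """Split text into chunks of at most max_chars, breaking on blank lines.

    A single paragraph longer than max_chars is kept whole rather than cut
    mid-sentence, so a chunk can exceed the limit in that one case.
    """
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for para in text.split("\n\n"):
        # +2 accounts for the blank-line separator restored on join.
        extra = len(para) + (2 if current else 0)
        if current and size + extra > max_chars:
            chunks.append("\n\n".join(current))
            current, size = [], 0
            extra = len(para)
        current.append(para)
        size += extra
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Joining the chunks with a blank line reproduces the input exactly, which is what lets translated chunks be stitched back together. A fenced code block that contains blank lines could still be split by this simple rule, so a real implementation would likely need to treat fences as atomic.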

Translation Rules

The translator follows strict rules to ensure consistent and accurate translations:

Preserved in English

  • Front matter between --- markers (including id, title, sidebar_position)
  • All section headers (including those with HTML tags)
  • All HTML-wrapped headers and subheadings
  • Product names (e.g., Cloud Z CP)
  • Platform names (e.g., Kubernetes)
  • Service names (e.g., Container Management Service)
  • Tool names (e.g., Chrome, Gitea)
  • Version numbers and technical specifications
  • Table headers in markdown tables
  • Section IDs in curly braces
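
As an illustration of the first rule, the front matter can be separated from the body before translation and reattached unchanged afterwards. The helper below is a minimal sketch under that assumption; `split_front_matter` is a hypothetical name, not part of the package's documented API.

```python
def split_front_matter(text: str) -> tuple[str, str]:
    """Return (front_matter, body); front_matter includes both --- lines."""
    lines = text.splitlines(keepends=True)
    # Front matter must start on the very first line of the document.
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "".join(lines[: i + 1]), "".join(lines[i + 1 :])
    # No closing marker: treat the whole document as body.
    return "", text
```

Only the body would then be sent for translation, so keys such as id, title, and sidebar_position never reach the model.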

Translated to Target Language

  • All paragraphs and content text
  • Descriptions and explanations
  • UI messages and instructions
  • List items and bullet points
  • Table content (except headers)
  • Sentences containing product names (while keeping the names in English)

Consistency Requirements

  • No mixed language content within sentences
  • Complete translation of all paragraphs
  • Preservation of markdown structure and formatting
  • Exact maintenance of whitespace and line numbers
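
Two of these requirements are easy to check mechanically after a translation run. The sketch below compares line counts and markdown section headers between a source document and its translation. It is an illustrative check, not something the package is documented to ship, and `check_structure` is a hypothetical name.

```python
import re

# Matches ATX-style markdown headings such as "## Installation".
HEADING = re.compile(r"^\s{0,3}#{1,6}\s")


def check_structure(source: str, translated: str) -> list[str]:
    """Return a list of human-readable problems; empty means no issue found."""
    problems: list[str] = []
    src_lines = source.split("\n")
    out_lines = translated.split("\n")
    if len(src_lines) != len(out_lines):
        problems.append(
            f"line count changed: {len(src_lines)} -> {len(out_lines)}"
        )
    # Headers are expected to stay in English, so they must match exactly.
    src_heads = [line for line in src_lines if HEADING.match(line)]
    out_heads = [line for line in out_lines if HEADING.match(line)]
    if src_heads != out_heads:
        problems.append("section headers differ from the source")
    return problems
```

The check does not cover HTML-wrapped headers or mixed-language sentences; those would need separate rules.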

Installation

# Using Poetry (recommended), from a clone of the repository
poetry install

# Or using pip
pip install zmp-md-translator

Usage

Basic Command Structure

zmp-translate \
  --source-dir SOURCE_PATH \
  --target-dir TARGET_DIR \
  --languages LANG_CODES \
  --solution SOLUTION_TYPE

Required parameters:

  • --source-dir, -s (SOURCE_PATH): Path to a markdown file or a directory containing markdown files
  • --languages, -l (LANG_CODES): Comma-separated list of target language codes
  • --solution (SOLUTION_TYPE): Type of documentation (zcp, apim, or amdp)

Optional parameters:

  • --target-dir, -t (TARGET_DIR): Target directory for translations (default: "i18n")
  • --model: OpenAI model to use (overrides the OPENAI_MODEL setting in .env)
  • --chunk-size: Maximum chunk size for translation
  • --concurrent: Maximum number of concurrent requests

Example Usage

# Translate a directory
zmp-translate \
  --source-dir "./repo/docs/zcp/v2.0" \
  --target-dir "./repo/i18n" \
  --languages "ko,ja,zh" \
  --solution zcp

# Translate a single file
zmp-translate \
  --source-dir "./repo/docs/zcp/v2.0/FAQ.mdx" \
  --target-dir "./repo/i18n" \
  --languages "ko" \
  --solution zcp

Using short options:

# Directory translation
zmp-translate -s "./repo/docs/zcp/v2.0" -t "./repo/i18n" -l "ko,ja,zh" --solution zcp

# Single file translation
zmp-translate -s "./repo/docs/zcp/v2.0/FAQ.mdx" -t "./repo/i18n" -l "ko" --solution zcp

Output Directory Structure

The translator creates a Docusaurus-compatible directory structure:

i18n/
├── ko/
│   └── docusaurus-plugin-content-docs-zcp/
│       └── current/
│           └── v2.0/
│               ├── FAQ.mdx
│               └── ...
├── ja/
│   └── docusaurus-plugin-content-docs-zcp/
│       └── current/
│           └── v2.0/
│               ├── FAQ.mdx
│               └── ...
└── zh/
    └── docusaurus-plugin-content-docs-zcp/
        └── current/
            └── v2.0/
                ├── FAQ.mdx
                └── ...
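
The layout above follows a regular pattern, which the following sketch reproduces. It assumes the path below `current/` is the file's path relative to the solution's docs folder (for example `v2.0/FAQ.mdx`). How the tool derives that relative path is not documented here, and `target_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def target_path(
    target_dir: str, lang: str, solution: str, relative: str
) -> PurePosixPath:
    """Build the Docusaurus i18n path for one translated file."""
    return (
        PurePosixPath(target_dir)
        / lang
        / f"docusaurus-plugin-content-docs-{solution}"
        / "current"
        / relative
    )


print(target_path("i18n", "ko", "zcp", "v2.0/FAQ.mdx"))
# i18n/ko/docusaurus-plugin-content-docs-zcp/current/v2.0/FAQ.mdx
```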

Supported Language Codes

The following language codes are supported:

Code  Language
ko    Korean
fr    French
ja    Japanese
es    Spanish
de    German
zh    Chinese
ru    Russian
it    Italian
pt    Portuguese
ar    Arabic
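
When scripting around the CLI, it can help to validate a comma-separated language list against this table before starting a long run. The sketch below does that. `parse_languages` is a hypothetical helper, and whether the tool itself rejects or ignores unknown codes is not stated here.

```python
SUPPORTED = {
    "ko": "Korean", "fr": "French", "ja": "Japanese", "es": "Spanish",
    "de": "German", "zh": "Chinese", "ru": "Russian", "it": "Italian",
    "pt": "Portuguese", "ar": "Arabic",
}


def parse_languages(value: str) -> list[str]:
    """Parse a value such as "ko,ja,zh" and reject unsupported codes."""
    codes = [c.strip().lower() for c in value.split(",") if c.strip()]
    unknown = [c for c in codes if c not in SUPPORTED]
    if unknown:
        raise ValueError(f"unsupported language code(s): {', '.join(unknown)}")
    return codes
```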

Environment Configuration

Create a .env file in your project root:

# OpenAI Configuration
OPENAI_API_KEY=your-api-key-here
OPENAI_MODEL=your-model-here

# Performance Settings
MAX_CHUNK_SIZE=4000
MAX_CONCURRENT_REQUESTS=5
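
To show how the two performance settings might be consumed, here is a minimal sketch that reads them from the environment and caps the number of in-flight requests with an asyncio.Semaphore. The fallback values 4000 and 5 are taken from the example above and are assumptions, not confirmed defaults. `Settings`, `load_settings`, `run_limited`, and `fake_translate` are hypothetical names, and the sketch reads variables that are already in the environment rather than parsing the .env file itself.

```python
import asyncio
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    max_chunk_size: int
    max_concurrent_requests: int


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        max_chunk_size=int(env.get("MAX_CHUNK_SIZE", "4000")),
        max_concurrent_requests=int(env.get("MAX_CONCURRENT_REQUESTS", "5")),
    )


async def run_limited(chunks, worker, limit: int) -> list:
    """Run worker on every chunk, with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(chunk):
        async with semaphore:
            return await worker(chunk)

    # gather returns results in the same order as the input chunks.
    return await asyncio.gather(*(guarded(c) for c in chunks))


async def fake_translate(chunk: str) -> str:
    # Placeholder for a real API call; it only upper-cases the text.
    await asyncio.sleep(0)
    return chunk.upper()
```

A call such as `asyncio.run(run_limited(chunks, fake_translate, settings.max_concurrent_requests))` keeps the output order stable, so translated chunks can be joined back in sequence.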

Development

# Install dependencies
poetry install

# Run tests
poetry run test

# Run with watch mode (development)
poetry run watch

License

This project is distributed under the MIT License. See the LICENSE file for more information.
