Skip to main content

AI-native localization pipeline with automated quality control

Project description

Omni-Localizer (OL)

AI-native localization pipeline that translates documents through intelligent LLM routing with built-in quality control.

What It Does

  • Translate documents (Markdown, XLIFF) using LLM APIs
  • Automatic failover — switches to backup model if primary fails
  • Quality preservation — shields code blocks, links, images during translation
  • LLM-based judging — evaluates translation accuracy and fluency
  • Restoration layer — uses LLM to restore placeholders after translation

Quick Start

1. Install

pip install -e .

2. Configure API Keys

Create a .bat file (gitignored) with your API keys:

@echo off
set OPENAI_API_KEY=your_api_key
set PYTHONPATH=src
python -m ol_cli translate-md %* -c config/default.yaml -s en -t zh

3. Run

test_en_to_zh.bat your_document.md -o output/

Configuration

config/default.yaml — Example LLM pool configuration:

llm_pool:
  translation:
    - provider: "openai"
      model: "gpt-4o-mini"
      priority: 1
      api_key: "${OPENAI_API_KEY}"
      role: "translation"
    - provider: "openai"
      model: "gpt-4o"
      priority: 2
      api_key: "${OPENAI_API_KEY}"
      role: "translation"
  judging:
    - provider: "openai"
      model: "gpt-4o-mini"
      priority: 1
      api_key: "${OPENAI_API_KEY}"
      role: "judging"
  restoration:
    - provider: "openai"
      model: "gpt-4o-mini"
      priority: 1
      api_key: "${OPENAI_API_KEY}"
      role: "restoration"

CLI Commands

# Translate markdown
ol translate-md <file.md> -c <config.yaml> -s en -t zh -o output/

# Translate XLIFF
ol translate-xliff <file.xlf> -c <config.yaml> -s en -t zh -o output/

# Extract warnings from file
ol extract-warnings <file> -o warnings.md

Output Metadata

YAML Frontmatter (Markdown)

When translating Markdown files, OL automatically adds YAML frontmatter to the output:

---
source_lang: en
target_lang: zh
original_file: input.md
processor: "OL"
version: "0.1.0"
translated_at: 2026-05-22T15:00:00Z
---

# Content follows...

CLI Control:

# Enable frontmatter (default)
ol translate-md input.md -s en -t zh -o output/

# Disable frontmatter
ol translate-md input.md -s en -t zh -o output/ --no-frontmatter

XLIFF Header Note

When translating XLIFF files, OL adds a header note with translation metadata:

<?xml version="1.0" encoding="utf-8"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <header>
    <note from="OL">Translated from en to zh by OL</note>
  </header>
  <file original="input.xlf" source-language="en" target-language="zh">
    ...
  </file>
</xliff>

Batch Processing

Batch translate supports the same frontmatter options:

# With frontmatter (default)
ol translate-batch ./docs/ -s en -t zh -o output/

# Without frontmatter
ol translate-batch ./docs/ -s en -t zh -o output/ --no-frontmatter

Key Features

Feature Description
Model Pool Failover LiteLLM router with primary + backup models per role
Content Shielding Code blocks, links, images preserved during translation
4-Layer Repair Regex → Span alignment → LLM restoration → Safe fallback
Translation + Judging JudgeService evaluates quality (adequacy, fluency, terminology)
TM Integration hypomnema for translation memory lookups

Architecture

  • MD Channel: Token Stream + 4-layer semantic repair
  • XLIFF Channel: translate-toolkit based
  • LLM Routing: LiteLLM with model pool failover
  • LQA: openevalkit Scorer→Judge + COMET
  • TM: hypomnema (TMX)
  • Alignment: span-aligner + VectorAlign

Agent Usage

Omni-Localizer can be used as a skill by coding agents (OpenCode, Hermes). Agents read the SKILL.md file to understand how to invoke translation.

OpenCode

  1. Add the skill to your project:

    cp -r src/.opencode/skills/ol-localizer <your-project>/.opencode/skills/
    
  2. Reference it in your OpenCode configuration if needed

For detailed usage, see src/.opencode/skills/ol-localizer/SKILL.md

Hermes

  1. Copy or symlink the skill:

    cp -r src/.hermes/skills/ol-localizer ~/.hermes/skills/
    
  2. Restart Hermes to activate

For detailed usage, see src/.hermes/skills/ol-localizer/SKILL.md

Environment Variables

Configure your LLM provider API keys in your shell environment.

Testing the Agent Integration

Verify skill files exist:

ls src/.opencode/skills/ol-localizer/SKILL.md
ls src/.hermes/skills/ol-localizer/SKILL.md

Test JSON output (machine-readable for agents):

python -m ol_cli translate-md input.md -c config/default.yaml -s en -t zh -o output/ --json

Expected JSON output:

{"success": true, "input_file": "input.md", "output_file": "output/input.md", "source_lang": "en", "target_lang": "zh"}

Run skill tests:

pytest tests/test_opencode_skill.py tests/test_hermes_skill.py -v

Verify --json flag in help:

python -m ol_cli translate-md --help | grep json

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

omni_localizer-0.2.0.tar.gz (91.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

omni_localizer-0.2.0-py3-none-any.whl (55.5 kB view details)

Uploaded Python 3

File details

Details for the file omni_localizer-0.2.0.tar.gz.

File metadata

  • Download URL: omni_localizer-0.2.0.tar.gz
  • Upload date:
  • Size: 91.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for omni_localizer-0.2.0.tar.gz
Algorithm Hash digest
SHA256 c017b60c731a2cf818ab796d662dde73d5fef05291b9272afef75be8319f7697
MD5 cbbe83695a0d42401b5cd0493a6674fe
BLAKE2b-256 6ce3c0b431df6c7b83000ef44c91ce3e72e49d9d49c57cb3c57a32fdce257072

See more details on using hashes here.

Provenance

The following attestation bundles were made for omni_localizer-0.2.0.tar.gz:

Publisher: publish.yml on 1StepMore/Omni_Localizer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file omni_localizer-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: omni_localizer-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 55.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for omni_localizer-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2f9e6898814680c1fcc25700d7e87ab89f9e10c86902a50b3633d70ef09b4482
MD5 e98ae4d8d6111ce31059bcd70860df61
BLAKE2b-256 3fe6b6553b052076dbdb997bf3fceeee0db9b3d9d761d5878ac79a21b9384192

See more details on using hashes here.

Provenance

The following attestation bundles were made for omni_localizer-0.2.0-py3-none-any.whl:

Publisher: publish.yml on 1StepMore/Omni_Localizer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page