AI-native localization pipeline with automated quality control

Omni-Localizer (OL)

AI-native localization pipeline that translates documents through intelligent LLM routing with built-in quality control.

What It Does

  • Translate documents (Markdown, XLIFF) using LLM APIs
  • Automatic failover — switches to backup model if primary fails
  • Quality preservation — shields code blocks, links, images during translation
  • LLM-based judging — evaluates translation accuracy and fluency
  • Restoration layer — uses LLM to restore placeholders after translation
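The shielding idea above can be sketched in a few lines of Python. This is a minimal illustration, not the package's implementation: the `__OL_n__` placeholder format, the regex, and the `shield`/`unshield` names are all assumptions made for the example.

```python
import re

# Hypothetical pattern for spans that must not be translated.
SHIELD_PATTERN = re.compile(
    r"```.*?```"                # fenced code block
    r"|`[^`\n]+`"               # inline code
    r"|!\[[^\]]*\]\([^)]*\)"    # image
    r"|\[[^\]]*\]\([^)]*\)",    # link
    re.DOTALL,
)


def shield(text: str) -> tuple[str, dict[str, str]]:
    """Replace protected spans with opaque placeholders."""
    mapping: dict[str, str] = {}

    def _swap(match: re.Match) -> str:
        key = f"__OL_{len(mapping)}__"  # assumed placeholder format
        mapping[key] = match.group(0)
        return key

    return SHIELD_PATTERN.sub(_swap, text), mapping


def unshield(text: str, mapping: dict[str, str]) -> str:
    """Put the original spans back wherever their placeholder survived."""
    for key, original in mapping.items():
        text = text.replace(key, original)
    return text
```

The shielded text is what would be sent to the translation model; `unshield` runs on the model's output. Placeholders the model drops or mangles are not recovered by this sketch, which is the gap the restoration layer is described as covering.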

Quick Start

1. Install

pip install -e .

2. Configure API Keys

On Windows, create a .bat file (gitignored), for example test_en_to_zh.bat, that sets your API keys and runs the translation:

@echo off
set OPENAI_API_KEY=your_api_key
set PYTHONPATH=src
python -m ol_cli translate-md %* -c config/default.yaml -s en -t zh

3. Run

test_en_to_zh.bat your_document.md -o output/

Configuration

config/default.yaml — Example LLM pool configuration:

llm_pool:
  translation:
    - provider: "openai"
      model: "gpt-4o-mini"
      priority: 1
      api_key: "${OPENAI_API_KEY}"
      role: "translation"
    - provider: "openai"
      model: "gpt-4o"
      priority: 2
      api_key: "${OPENAI_API_KEY}"
      role: "translation"
  judging:
    - provider: "openai"
      model: "gpt-4o-mini"
      priority: 1
      api_key: "${OPENAI_API_KEY}"
      role: "judging"
  restoration:
    - provider: "openai"
      model: "gpt-4o-mini"
      priority: 1
      api_key: "${OPENAI_API_KEY}"
      role: "restoration"

CLI Commands

# Translate markdown
ol translate-md <file.md> -c <config.yaml> -s en -t zh -o output/

# Translate XLIFF
ol translate-xliff <file.xlf> -c <config.yaml> -s en -t zh -o output/

# Extract warnings from file
ol extract-warnings <file> -o warnings.md

Key Features

| Feature | Description |
| --- | --- |
| Model Pool Failover | LiteLLM router with primary + backup models per role |
| Content Shielding | Code blocks, links, images preserved during translation |
| 4-Layer Repair | Regex → Span alignment → LLM restoration → Safe fallback |
| Translation + Judging | JudgeService evaluates quality (adequacy, fluency, terminology) |
| TM Integration | hypomnema for translation memory lookups |
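The layered repair in the table can be read as a fallback chain: each layer either returns a repaired string or passes to the next one. The sketch below shows that control flow only. The `__OL_n__` placeholder format, the toy `regex_layer`, and the choice of returning the source segment as the "safe fallback" are assumptions for illustration, not the package's actual layers.

```python
import re
from typing import Callable, Optional

# A repair layer takes (source, translated) and returns a fix or None.
Repair = Callable[[str, str], Optional[str]]

PLACEHOLDER = re.compile(r"__OL_\d+__")  # assumed placeholder format


def regex_layer(source: str, translated: str) -> Optional[str]:
    """Toy first layer: undo whitespace inserted inside placeholders."""
    fixed = re.sub(r"__\s*OL_(\d+)\s*__", r"__OL_\1__", translated)
    # Succeed only if the repaired text carries the same placeholders as the source.
    if sorted(PLACEHOLDER.findall(fixed)) == sorted(PLACEHOLDER.findall(source)):
        return fixed
    return None


def run_repair_chain(source: str, translated: str, layers: list[Repair]) -> str:
    """Return the first successful repair, else fall back to the source text."""
    for layer in layers:
        repaired = layer(source, translated)
        if repaired is not None:
            return repaired
    return source  # assumed safe fallback: keep the untranslated segment
```

In a fuller version, span alignment and LLM restoration would be further entries in `layers`, ordered from cheapest to most expensive.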

Architecture

  • MD Channel: Token Stream + 4-layer semantic repair
  • XLIFF Channel: translate-toolkit based
  • LLM Routing: LiteLLM with model pool failover
  • LQA: openevalkit Scorer→Judge + COMET
  • TM: hypomnema (TMX)
  • Alignment: span-aligner + VectorAlign

Agent Usage

Omni-Localizer can be used as a skill by coding agents (OpenCode, Hermes). Agents read the SKILL.md file to understand how to invoke translation.

OpenCode

  1. Add the skill to your project:

    cp -r src/.opencode/skills/ol-localizer <your-project>/.opencode/skills/
    
  2. Reference it in your OpenCode configuration if needed

For detailed usage, see src/.opencode/skills/ol-localizer/SKILL.md

Hermes

  1. Copy or symlink the skill:

    cp -r src/.hermes/skills/ol-localizer ~/.hermes/skills/
    
  2. Restart Hermes to activate

For detailed usage, see src/.hermes/skills/ol-localizer/SKILL.md

Environment Variables

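Set your LLM provider API keys in your shell environment before running the CLI. The example config/default.yaml reads OPENAI_API_KEY through the ${OPENAI_API_KEY} placeholder.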
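For reference, a `${NAME}` placeholder like the one in the example config could be resolved from the environment roughly as follows. This is a sketch under the assumption of plain substitution; how Omni-Localizer actually expands these values is not documented here, and `expand_env` is a hypothetical helper.

```python
import os
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env=None) -> str:
    """Replace ${NAME} with env[NAME]; raise if a referenced variable is unset."""
    env = os.environ if env is None else env

    def _lookup(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(_lookup, value)
```

Failing loudly on an unset variable surfaces a missing key at load time instead of as an authentication error on the first request.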

Testing the Agent Integration

Verify skill files exist:

ls src/.opencode/skills/ol-localizer/SKILL.md
ls src/.hermes/skills/ol-localizer/SKILL.md

Test JSON output (machine-readable for agents):

python -m ol_cli translate-md input.md -c config/default.yaml -s en -t zh -o output/ --json

Expected JSON output:

{"success": true, "input_file": "input.md", "output_file": "output/input.md", "source_lang": "en", "target_lang": "zh"}
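An agent or wrapper script could consume that line along these lines. The field names come from the expected output above; the assumption that a failed run also prints an object with `"success": false` is mine, and `parse_result` is a hypothetical helper.

```python
import json

# Fields shown in the expected JSON output.
REQUIRED_KEYS = {"success", "input_file", "output_file", "source_lang", "target_lang"}


def parse_result(stdout: str) -> dict:
    """Parse the one-line JSON report and check it has the documented fields."""
    result = json.loads(stdout)
    missing = REQUIRED_KEYS - result.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not result["success"]:
        # Assumed failure shape: same fields with "success": false.
        raise RuntimeError(f"translation failed for {result['input_file']}")
    return result
```

In practice the string would come from the captured stdout of the `translate-md ... --json` command, for example via `subprocess.run(..., capture_output=True, text=True)`.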

Run skill tests:

pytest tests/test_opencode_skill.py tests/test_hermes_skill.py -v

Verify --json flag in help:

python -m ol_cli translate-md --help | grep json

License

MIT
