Translate from one language to another.

Interpres — Translator

Translate text and files between languages using Hugging Face translation models (default: Meta NLLB).

Interpres (Translator) is a lightweight CLI and Python package for fast, batch-friendly translation workflows. It supports single-sentence translation, directory and file translation, and robust PO (gettext) file handling for localization workflows.

Key features

  • CLI + Python API
  • Default model: facebook/nllb-200-distilled-600M (configurable)
  • Fast batch translation with configurable batch size, epochs, and parallelism
  • Full support for PO files: preserves metadata, comments, and structure; translates only untranslated entries by default; force retranslate option
  • Language list compatible with NLLB (200+ languages)

Quick install

pip:

pip install interpres

Install from source:

pip install git+https://github.com/wasertech/Translator.git

Specify a release:

pip install interpres==0.3.1b4
pip install git+https://github.com/wasertech/Translator.git@v0.3.1b4

CLI overview

Run:

translate [FROM] [TO] [SENTENCES...]

Basic examples:

# Single sentence
translate en fr "This is a test."

# Interactive shell
translate
# or get help
translate --help

# Translate a directory and save output
translate --directory ./texts --save translations.txt eng_Latn fra_Latn

Important options (common)

  • -m, --model_id MODEL_ID : Hugging Face model ID to use
  • -d, --directory DIRECTORY : Translate files in a directory
  • --po : Translate PO files (gettext)
  • --force : Force retranslation (including already translated entries)
  • -b, --batch_size : Batch size for model inference
  • -e, --nepoch : Number of epoch splits used to pipeline batches (tweak to avoid OOM)
  • -n, --nproc : Number of CPU workers for preprocessing/filtering
  • -L, --language_list : Show supported languages
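When the CLI is driven from a script, the options above can be assembled programmatically. The following is a minimal sketch: `build_translate_command` is a hypothetical helper, not part of Interpres, and the batch size and worker count are placeholder values. It uses only the flags listed above; whether a particular combination (for example `--po` with `--directory`) is accepted is best confirmed with `translate --help`.

```python
import shlex


def build_translate_command(src, tgt, directory=None, po=False, force=False,
                            batch_size=None, nproc=None, model_id=None):
    """Assemble an argv list for the `translate` CLI (hypothetical helper)."""
    cmd = ["translate"]
    if model_id:
        cmd += ["--model_id", model_id]
    if directory:
        cmd += ["--directory", str(directory)]
    if po:
        cmd.append("--po")
    if force:
        cmd.append("--force")
    if batch_size is not None:
        cmd += ["--batch_size", str(batch_size)]
    if nproc is not None:
        cmd += ["--nproc", str(nproc)]
    # Positional FROM and TO come last, as in the directory example above.
    cmd += [src, tgt]
    return cmd


cmd = build_translate_command("eng_Latn", "fra_Latn", directory="./locale",
                              po=True, batch_size=16, nproc=4)
# Print a shell-quoted version for inspection.
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` as-is, which avoids shell quoting issues with paths that contain spaces.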

PO-file translation (high level)

  • Finds .po files recursively and validates Language metadata
  • By default translates empty msgstr entries only
  • --force reprocesses every entry
  • Preserves comments, headers, and file formatting — ideal for Poedit/Django workflows
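The difference between the default behavior and `--force` can be illustrated without the library. This is a minimal sketch and not Interpres's implementation: entries are modeled as plain `(msgid, msgstr)` tuples, and `select_entries` is a hypothetical helper.

```python
def select_entries(entries, force=False):
    """Return the msgids that would be sent for translation.

    `entries` is a list of (msgid, msgstr) tuples standing in for PO entries.
    """
    selected = []
    for msgid, msgstr in entries:
        if not msgid:
            # The entry with an empty msgid is the PO header; skip it.
            continue
        if force or not msgstr.strip():
            selected.append(msgid)
    return selected


entries = [
    ("", "Language: fr\n"),   # header entry (metadata)
    ("Save", "Enregistrer"),  # already translated
    ("Cancel", ""),           # untranslated
]

print(select_entries(entries))              # ['Cancel']
print(select_entries(entries, force=True))  # ['Save', 'Cancel']
```

In both modes the header entry is left alone, which matches the idea that metadata is preserved rather than translated.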

Python API (simple)

from translator import Translator

t = Translator("eng_Latn", "fra_Latn")
out = t.translate("This is a simple sentence.")
print(out)

PO-file example (Python)

from translator import Translator, utils

translator = Translator("eng_Latn", "spa_Latn")
po = utils.read_po_file("messages.po")

if utils.should_translate_po_file(po, "eng_Latn"):
    entries = utils.extract_untranslated_from_po(po)
    translations = translator.translate(entries)
    mapping = dict(zip(entries, translations))
    utils.update_po_with_translations(po, mapping, force=False)
    utils.save_po_file(po, "messages.po")
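To see what the `mapping` in the example above looks like without loading a model, here is a small sketch that swaps the real translator for a stand-in. `fake_translate` is hypothetical and only tags each string; it assumes, as the example above does, that translating a list returns one result per input in the same order. The deduplication step is an optional addition, not something the example above does.

```python
def fake_translate(sentences):
    """Stand-in for a translator: returns one output per input, in order."""
    return [f"[es] {s}" for s in sentences]


entries = ["Save", "Cancel", "Save"]

# Deduplicate while keeping first-seen order, so a repeated source string
# is translated only once.
unique_entries = list(dict.fromkeys(entries))
translations = fake_translate(unique_entries)

# Pair each source string with its translation.
mapping = dict(zip(unique_entries, translations))
print(mapping)
```

Because `zip` stops at the shorter input, a translator that returned fewer results than inputs would silently drop entries; comparing `len(translations)` with `len(unique_entries)` before building the mapping is a cheap safeguard.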

Language support

The set of available languages depends on the model you use. With the default NLLB model, more than 200 languages are supported.

# Short form: translate -L
translate --language_list
Language list:
    ...

From Python:

>>> import translator
>>> len(translator.LANGS)
202
>>> translator.LANGS
['ace_Arab', '...', 'zul_Latn']
>>> from translator.language import get_nllb_lang, get_sys_lang_format
>>> nllb_lang = get_nllb_lang("en")
>>> nllb_lang
'eng_Latn'
>>> get_sys_lang_format()
'fra_Latn'

Check out LANGS to see the full list of supported languages.
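NLLB codes such as eng_Latn pair a three-letter language code with a four-letter script code. The sketch below checks that shape before a code is passed on. `split_nllb_code` is a hypothetical helper, and it validates only the format, not whether the code actually appears in translator.LANGS.

```python
import re

# Three lowercase letters, an underscore, then a capitalized four-letter script.
_NLLB_CODE = re.compile(r"([a-z]{3})_([A-Z][a-z]{3})")


def split_nllb_code(code):
    """Split 'eng_Latn' into ('eng', 'Latn'); raise ValueError on bad shape."""
    match = _NLLB_CODE.fullmatch(code)
    if match is None:
        raise ValueError(f"not an NLLB-style code: {code!r}")
    return match.group(1), match.group(2)


print(split_nllb_code("eng_Latn"))  # ('eng', 'Latn')
print(split_nllb_code("ace_Arab"))  # ('ace', 'Arab')
```

A short code such as "en" fails this check, which is the case where a conversion helper like get_nllb_lang shown above is useful.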

Custom models

  • Use any Hugging Face translation model compatible with the Transformers pipeline: translate --model_id "HUGGINGFACE/MODEL_ID" ...
  • Prefer models that match your language pair and domain. Models trained or fine-tuned specifically for a given language pair (e.g., en→fr) often produce noticeably better results than general multilingual models.
  • Domain/context-specific models are best: if you're translating a website, a model trained on website or localization data (or fine-tuned on your site's content) will usually yield more accurate, consistent, and context-aware translations than the default general-purpose model.

Performance tips

  • Set nepoch (-e) and batch_size (-b) to fit your device memory. A larger batch_size increases throughput but uses more memory.
  • Use -n to match your CPU threads for preprocessing speed.
  • Consider a custom model: a language-pair-specific or domain-specific model (or one fine-tuned on your data) often improves quality and consistency for specialized content such as legal texts, technical docs, or websites.
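How nepoch and batch_size interact can be pictured as two levels of chunking. The sketch below is an assumption about the general idea, not Interpres's actual scheduling: the input is first split into nepoch roughly equal slices, and each slice is then cut into batches of at most batch_size items. `split_epochs` and `batches` are hypothetical helpers.

```python
import math


def split_epochs(items, nepoch):
    """Split items into at most `nepoch` contiguous, roughly equal slices."""
    if nepoch < 1:
        raise ValueError("nepoch must be >= 1")
    size = max(1, math.ceil(len(items) / nepoch))
    return [items[i:i + size] for i in range(0, len(items), size)]


def batches(items, batch_size):
    """Yield consecutive batches of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


sentences = [f"sentence {i}" for i in range(10)]
for epoch in split_epochs(sentences, nepoch=2):
    sizes = [len(b) for b in batches(epoch, batch_size=3)]
    print(len(epoch), sizes)  # 5 [3, 2]
```

Under this picture, raising nepoch shrinks how much data is held at once, while batch_size controls how many sentences reach the model per inference call.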

License

Mozilla Public License 2.0 — see LICENSE

When you use this tool to translate a sentence, the license of the original sentence still applies to the translation unless specified otherwise.

For example, if you translate a sentence released under Creative Commons CC0, the translation is also under Creative Commons CC0.

The same holds for any other license.

Contribute & sponsor

Thanks for building with Interpres — translate confidently, scale thoughtfully.
