Translate from one language to another.
Project description
Interpres — Translator
Translate text and files between languages using Hugging Face translation models (default: Meta NLLB).
Interpres (Translator) is a lightweight CLI and Python package for fast, batch-friendly translation workflows. It supports single-sentence translation, directory and file translation, and robust PO (gettext) file handling for localization workflows.
Key features
- CLI + Python API
- Default model: facebook/nllb-200-distilled-600M (configurable)
- Fast batch translation with configurable batch size, epochs, and parallelism
- Full support for PO files: preserves metadata, comments, and structure; translates only untranslated entries by default; force retranslate option
- Language list compatible with NLLB (200+ languages)
Quick install
pip:
pip install interpres
Install from source:
pip install git+https://github.com/wasertech/Translator.git
Specify a release:
pip install interpres==0.3.1b4
pip install git+https://github.com/wasertech/Translator.git@v0.3.1b4
CLI overview
Run:
translate [FROM] [TO] [SENTENCES...]
Basic examples:
# Single sentence
translate en fr "This is a test."
# Interactive shell
translate
# or get help
translate --help
# Translate a directory and save output
translate --directory ./texts --save translations.txt eng_Latn fra_Latn
Important options (common)
- -m, --model_id MODEL_ID : Hugging Face model ID to use
- -d, --directory DIRECTORY : Translate files in a directory
- --po : Translate PO files (gettext)
- --force : Force retranslation (including already translated entries)
- -b, --batch_size : Batch size for model inference
- -e, --nepoch : Number of epoch splits used to pipeline batches (tweak to avoid OOM)
- -n, --nproc : Number of CPU workers for preprocessing/filtering
- -L, --language_list : Show supported languages
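These flags compose on a single command line. The sketch below is illustrative only: it drives the documented CLI from Python with subprocess, and the model ID, paths, language codes, and tuning values are placeholders to adapt to your own setup rather than recommendations.

import subprocess

# Translate every file in ./texts with an explicit model and tuning values,
# then write the output to translations.txt (all flags are documented above).
subprocess.run(
    [
        "translate",
        "--model_id", "facebook/nllb-200-distilled-600M",
        "--directory", "./texts",
        "--batch_size", "16",
        "--nepoch", "4",
        "--nproc", "4",
        "--save", "translations.txt",
        "eng_Latn", "fra_Latn",
    ],
    check=True,
)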
PO-file translation (high level)
- Finds .po files recursively and validates Language metadata
- By default translates empty msgstr entries only
- --force reprocesses every entry
- Preserves comments, headers, and file formatting — ideal for Poedit/Django workflows
Python API (simple)
from translator import Translator
t = Translator("eng_Latn", "fra_Latn")
out = t.translate("This is a simple sentence.")
print(out)
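A Translator instance can also take a list of sentences in a single call, as the PO-file example below does; passing a list lets the pipeline batch the work instead of translating one string at a time. A small sketch:

from translator import Translator

t = Translator("eng_Latn", "fra_Latn")
sentences = [
    "This is the first sentence.",
    "This is the second sentence.",
]
# translate() accepts a list and returns the translations in the same order
translations = t.translate(sentences)
for src, dst in zip(sentences, translations):
    print(f"{src} -> {dst}")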
PO-file example (Python)
from translator import Translator, utils
translator = Translator("eng_Latn", "spa_Latn")
po = utils.read_po_file("messages.po")
if utils.should_translate_po_file(po, "eng_Latn"):
    # Translate only the entries that are still untranslated
    entries = utils.extract_untranslated_from_po(po)
    translations = translator.translate(entries)
    mapping = dict(zip(entries, translations))
    # Write the new msgstr values back, then save in place
    utils.update_po_with_translations(po, mapping, force=False)
    utils.save_po_file(po, "messages.po")
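To process a whole localization tree from Python, the same helpers combine with the standard library. A minimal sketch, assuming a conventional locale/ directory and English sources; it uses only the utils functions shown above:

from pathlib import Path

from translator import Translator, utils

translator = Translator("eng_Latn", "fra_Latn")

# Walk the locale tree and translate every catalogue that still has untranslated entries.
for po_path in Path("locale").rglob("*.po"):
    po = utils.read_po_file(str(po_path))
    if not utils.should_translate_po_file(po, "eng_Latn"):
        continue
    entries = utils.extract_untranslated_from_po(po)
    if not entries:
        continue
    translations = translator.translate(entries)
    utils.update_po_with_translations(po, dict(zip(entries, translations)), force=False)
    utils.save_po_file(po, str(po_path))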
Language support
Language support depends on the model you use: some models offer fewer choices, but with NLLB you get more than 200 of the most widely used languages.
# translate -L
❯ translate --language_list
Language list:
...
From Python:
>>> import translator
>>> len(translator.LANGS)
202
>>> translator.LANGS
['ace_Arab', '...', 'zul_Latn']
>>> from translator.language import get_nllb_lang, get_sys_lang_format
>>> nllb_lang = get_nllb_lang("en")
>>> nllb_lang
'eng_Latn'
>>> get_sys_lang_format()
'fra_Latn'
Check out translator.LANGS to see the full list of supported languages.
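These helpers combine naturally: map a short code to its NLLB equivalent, detect the system language, and build a Translator from the pair. A small sketch (the "en" source side is just an example):

from translator import Translator
from translator.language import get_nllb_lang, get_sys_lang_format

source = get_nllb_lang("en")    # 'eng_Latn'
target = get_sys_lang_format()  # e.g. 'fra_Latn' on a French locale
t = Translator(source, target)
print(t.translate("Hello, world!"))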
Custom models
- Use any Hugging Face translation model compatible with the Transformers pipeline:
  translate --model_id "HUGGINGFACE/MODEL_ID" ...
- Prefer models that match your language pair and domain. Models trained or fine-tuned specifically for a given language pair (e.g., en→fr) often produce noticeably better results than general multilingual models.
- Domain/context-specific models are best: if you're translating a website, a model trained on website or localization data (or fine-tuned on your site's content) will usually yield more accurate, consistent, and context-aware translations than the default general-purpose model.
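From the Python API the model can presumably be selected in the same way; the constructor signature is not shown above, so the model_id keyword in this sketch is an assumption mirroring the CLI's --model_id flag. Check the installed package for the exact parameter name.

from translator import Translator

# NOTE: `model_id` is assumed here to mirror the CLI's --model_id flag;
# verify the actual constructor argument of Translator in the installed package.
t = Translator("eng_Latn", "fra_Latn", model_id="facebook/nllb-200-distilled-600M")
print(t.translate("This is a test."))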
Performance tips
- Set nepoch (-e) and batch_size (-b) to fit your device memory. A larger batch_size increases throughput but uses more memory.
- Use -n to match your CPU threads for preprocessing speed.
- Use custom models: choosing a language-pair-specific or domain-specific model (or fine-tuning one on your data) often improves translation quality and consistency, especially for specialized content such as legal texts, technical docs, or websites.
License
Mozilla Public License 2.0 — see LICENSE
When you use this tool to translate a sentence, the licence of the original sentence still applies to the translation unless specified otherwise. For example, if you translate a sentence released under Creative Commons CC0, the translation is also under CC0. The same applies to any other licence.
Contribute & sponsor
- Share and talk about the translation models you like to use, and tell us why.
- Open issues or PRs for features, bugfixes, or performance improvements.
- Sponsor this project
Thanks for building with Interpres — translate confidently, scale thoughtfully.
Download files
Source Distribution: interpres-0.4.0b5.tar.gz
Built Distribution: interpres-0.4.0b5-py3-none-any.whl
File details
Details for the file interpres-0.4.0b5.tar.gz.
File metadata
- Download URL: interpres-0.4.0b5.tar.gz
- Upload date:
- Size: 24.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0a80949c0a1531fa4e16457326892fd722a5e63fc00fad31419b11da72b63205 |
| MD5 | 6511565f6fd0815a85c12750a4c6e1d1 |
| BLAKE2b-256 | 41c5417ce4c81c71b8d9f7d9e12a5a0cda9bd2600eba29513534245a61b2b1e7 |
File details
Details for the file interpres-0.4.0b5-py3-none-any.whl.
File metadata
- Download URL: interpres-0.4.0b5-py3-none-any.whl
- Upload date:
- Size: 24.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2842038cbfce62089f5fbfd3240ce8cab8969c4a068118b1c77598f9114c3ccd |
| MD5 | 1b96e59fbf39a9794baeb03545732323 |
| BLAKE2b-256 | b54b2b5ace99ce3571a26a02771e4790fefc70fc312069977a5c5ba0257e1cde |