A lazy yet bulletproof machine translation tool for Elastichsearch.
Project description
ES Translator
A lazy yet bulletproof machine translation tool for Elastichsearch.
Usage: es-translator [OPTIONS]
Options:
--url TEXT Elastichsearch URL [required]
--index TEXT Elastichsearch Index [required]
--interpreter TEXT Interpreter to use to perform the translation
--source-language TEXT Source language to translate from [required]
--target-language TEXT Target language to translate to [required]
--intermediary-language TEXT An intermediary language to use when no
translation is available between the source
and the target. If none is provided this will
be calculated automatically.
--source-field TEXT Document field to translate
--target-field TEXT Document field where the translations are
stored
--query-string TEXT Search query string to filter result
--data-dir PATH Path to the directory where to language model
will be downloaded
--scan-scroll TEXT Scroll duration (set to higher value if you're
processing a lot of documents)
--dry-run Don't save anything in Elasticsearch
--pool-size INTEGER Number of parallel processes to start
--pool-timeout INTEGER Timeout to add a translation
--syslog-address TEXT Syslog address
--syslog-port INTEGER Syslog port
--syslog-facility TEXT Syslog facility
--stdout-loglevel TEXT Change the default log level for stdout error
handler
--help Show this message and exit.
Installation (Ubuntu)
Install Apertium:
wget https://apertium.projectjj.com/apt/install-nightly.sh -O - | sudo bash
sudo apt install apertium-all-dev
Create a Virtualenv and install Pip packages with Poetry:
make install
On Ubuntu 22.04 some additional packages might be needed if you use the version from Ubuntu's repository:
sudo apt install cg3 apertium-get apertium-lex-tools
Installation (Docker)
Nothing to do as long as you have Docker on your system:
docker run -it icij/es-translator poetry run es-translator --help
Examples
Translates documents from French to Spanish on a local Elasticsearch. The translated field is content
(the default).
poetry run es-translator --url "http://localhost:9200" --index my-index --source-language fr --target-language es
Translates documents from French to English on a local Elasticsearch using Apertium:
poetry run es-translator --url "http://localhost:9200" --index my-index --source-language fr --target-language en --interpreter apertium
To translate the title
field we could do:
poetry run es-translator --url "http://localhost:9200" --index my-index --source-language fr --target-language es --source-field title
Translates documents from English to Spanish on a local Elasticsearch using 4 threads:
poetry run es-translator --url "http://localhost:9200" --index my-index --source-language en --target-language es --pool-size 4
Translates documents from Portuguese to English, using an intermediary language (Apertium doesn't offer this translation pair):
poetry run es-translator --url "http://localhost:9200" --index my-index --source-language pt --intermediary-language es --target-language en
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for es_translator-1.4.1-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 14d2604dd39535dad10609ca251eeece0f1412bfede73da3c348738c327d253a |
|
MD5 | 1056c19da211301fd86b50ccee8618c5 |
|
BLAKE2b-256 | 3795126d8d2e9ab2038172a377b75dd059a232f87ae3284e7336f1315debdba2 |