A lazy yet bulletproof machine translation tool for Elasticsearch.
ES Translator
Usage: es-translator [OPTIONS]
Options:
  -u, --url TEXT                  Elasticsearch URL
  -i, --index TEXT                Elasticsearch Index  [required]
  -r, --interpreter TEXT          Interpreter to use to perform the
                                  translation
  -s, --source-language TEXT      Source language to translate from
                                  [required]
  -t, --target-language TEXT      Target language to translate to  [required]
  --intermediary-language TEXT    An intermediary language to use when no
                                  translation is available between the source
                                  and the target. If none is provided this
                                  will be calculated automatically.
  --source-field TEXT             Document field to translate
  --target-field TEXT             Document field where the translations are
                                  stored
  -q, --query-string TEXT         Search query string to filter results
  -d, --data-dir PATH             Path to the directory where the language
                                  model will be downloaded
  --scan-scroll TEXT              Scroll duration (set to a higher value if
                                  you're processing a lot of documents)
  --dry-run                       Don't save anything in Elasticsearch
  --pool-size INTEGER             Number of parallel processes to start
  --pool-timeout INTEGER          Timeout to add a translation
  --throttle INTEGER              Throttle between each translation (in ms)
  --syslog-address TEXT           Syslog address
  --syslog-port INTEGER           Syslog port
  --syslog-facility TEXT          Syslog facility
  --stdout-loglevel TEXT          Change the default log level for stdout
                                  error handler
  --progressbar / --no-progressbar
                                  Display a progressbar
  --help                          Show this message and exit.
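When the tool is launched from a Python script rather than typed by hand, the options above map onto a plain argument list. The following is a minimal sketch and not part of es-translator itself: the build_command helper and its parameter names are hypothetical, and it only uses flags that appear in the help output.

```python
import shlex
from typing import List, Optional


def build_command(
    index: str,
    source_language: str,
    target_language: str,
    url: Optional[str] = None,
    source_field: Optional[str] = None,
    pool_size: Optional[int] = None,
    dry_run: bool = False,
) -> List[str]:
    """Assemble an es-translator argument list (hypothetical helper)."""
    # The three options marked [required] in the help output always appear.
    command = [
        "es-translator",
        "--index", index,
        "--source-language", source_language,
        "--target-language", target_language,
    ]
    # Optional flags are added only when a value is given, so the tool's
    # own defaults apply otherwise.
    if url is not None:
        command += ["--url", url]
    if source_field is not None:
        command += ["--source-field", source_field]
    if pool_size is not None:
        command += ["--pool-size", str(pool_size)]
    if dry_run:
        command.append("--dry-run")
    return command


if __name__ == "__main__":
    args = build_command(
        "my-index", "fr", "es", url="http://localhost:9200", dry_run=True
    )
    # Print a shell-safe version of the command. To actually run it,
    # pass the list to subprocess.run(args, check=True).
    print(shlex.join(args))
```

Starting with --dry-run is a cautious way to try a new combination of options, since the help output states that nothing is saved in Elasticsearch in that mode.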
Installation (Ubuntu)
Install Apertium:
wget https://apertium.projectjj.com/apt/install-nightly.sh -O - | sudo bash
sudo apt install apertium-all-dev
Create a Virtualenv and install Pip packages with Poetry:
make install
On Ubuntu 22.04 some additional packages might be needed if you use the version from Ubuntu's repository:
sudo apt install cg3 apertium-get apertium-lex-tools
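After installing, it can be useful to confirm that the Apertium executable is reachable before starting a long translation run. This is a small sketch under the assumption that the packages above put an executable named apertium on the PATH; the missing_tools helper is hypothetical and is not provided by es-translator.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str] = ("apertium",)) -> List[str]:
    """Return the executables from `names` that are not found on PATH."""
    # shutil.which returns None when no matching executable is found.
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("Apertium executable found.")
```

Other executable names can be passed in as needed; which ones your setup requires depends on the language pairs you use.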
Installation (Docker)
Nothing to do as long as you have Docker on your system:
docker run -it icij/es-translator poetry run es-translator --help
Examples
Translates documents from French to Spanish on a local Elasticsearch. The translated field is content (the default):
poetry run es-translator --url "http://localhost:9200" --index my-index --source-language fr --target-language es
Translates documents from French to English on a local Elasticsearch using Apertium:
poetry run es-translator --url "http://localhost:9200" --index my-index --source-language fr --target-language en --interpreter apertium
To translate the title field we could do:
poetry run es-translator --url "http://localhost:9200" --index my-index --source-language fr --target-language es --source-field title
Translates documents from English to Spanish on a local Elasticsearch using 4 parallel processes:
poetry run es-translator --url "http://localhost:9200" --index my-index --source-language en --target-language es --pool-size 4
Translates documents from Portuguese to English, using an intermediary language (Apertium doesn't offer this translation pair):
poetry run es-translator --url "http://localhost:9200" --index my-index --source-language pt --intermediary-language es --target-language en
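The idea behind an intermediary language can be illustrated with a short sketch. It is not es-translator's actual selection logic, which this document does not describe: the find_intermediary function is hypothetical, and the list of pairs below is made up for the example rather than taken from Apertium.

```python
from typing import Iterable, Optional, Tuple

Pair = Tuple[str, str]


def find_intermediary(
    pairs: Iterable[Pair], source: str, target: str
) -> Optional[str]:
    """Pick a pivot language when no direct pair exists (illustrative only).

    Returns None if source -> target is directly available, the pivot
    language code if a two-step route exists, and raises LookupError
    otherwise.
    """
    available = set(pairs)
    if (source, target) in available:
        return None  # A direct translation exists, no pivot needed.
    # Keep every language reachable from the source that can also
    # reach the target; sort so the choice is deterministic.
    candidates = sorted(
        middle
        for (first, middle) in available
        if first == source and (middle, target) in available
    )
    if not candidates:
        raise LookupError(f"No route from {source} to {target}")
    return candidates[0]


if __name__ == "__main__":
    # Hypothetical pair list, for illustration only.
    example_pairs = [("pt", "es"), ("es", "en"), ("fr", "es")]
    print(find_intermediary(example_pairs, "pt", "en"))  # es
```

With these example pairs, Portuguese to English goes through Spanish, which matches the --intermediary-language es value used in the command above.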
Hashes for es_translator-1.5.0-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 37c3382c97fdd8ba7be1d21832fe92fa52085abaa25d655ddabfd81b18f6fc08 |
MD5 | 43789710473600125cc60c4bdc556ea4 |
BLAKE2b-256 | 62719594c6de2078ec9fed9a6aab261f37ffce4977983a79d62859480e7cccdd |