ES Translator
A lazy yet bulletproof machine translation tool for Elasticsearch.
Usage: es-translator [OPTIONS]
Options:
  --url TEXT                    Elasticsearch URL  [required]
  --index TEXT                  Elasticsearch index  [required]
  --source-language TEXT        Source language to translate from  [required]
  --target-language TEXT        Target language to translate to  [required]
  --intermediary-language TEXT  An intermediary language to use when no
                                translation is available between the source
                                and the target. If none is provided, this
                                will be calculated automatically.
  --source-field TEXT           Document field to translate
  --target-field TEXT           Document field where the translations are
                                stored
  --query-string TEXT           Search query string to filter results
  --data-dir PATH               Path to the directory where the language
                                model will be downloaded
  --scan-scroll TEXT            Scroll duration (set to a higher value if
                                you're processing a lot of documents)
  --dry-run                     Don't save anything in Elasticsearch
  --pool-size INTEGER           Number of parallel processes to start
  --pool-timeout INTEGER        Timeout to add a translation
  --syslog-address TEXT         Syslog address
  --syslog-port INTEGER         Syslog port
  --syslog-facility TEXT        Syslog facility
  --stdout-loglevel TEXT        Change the default log level for the stdout
                                error handler
  --help                        Show this message and exit.
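When the tool is driven from a script, it can help to assemble the argument list programmatically and start with `--dry-run`. The following is a minimal sketch, not part of es-translator itself: `build_command` is a hypothetical helper, it assumes the `es-translator` executable named in the usage line is on the PATH, and it only emits flags listed in the options above.

```python
def build_command(url, index, source_language, target_language, dry_run=True, **options):
    """Assemble an es-translator argument list (hypothetical helper).

    Extra keyword arguments are mapped to value-taking flags from the
    option list, e.g. source_field="title" becomes "--source-field title".
    """
    command = [
        "es-translator",
        "--url", url,
        "--index", index,
        "--source-language", source_language,
        "--target-language", target_language,
    ]
    for name, value in options.items():
        # Python keyword names use underscores, the CLI flags use hyphens.
        command += ["--" + name.replace("_", "-"), str(value)]
    if dry_run:
        # --dry-run takes no value: nothing is saved in Elasticsearch.
        command.append("--dry-run")
    return command


if __name__ == "__main__":
    command = build_command(
        "http://localhost:9200", "my-index", "fr", "es", source_field="title"
    )
    print(" ".join(command))
    # To actually run it (requires es-translator to be installed):
    #   import subprocess
    #   subprocess.run(command, check=True)
```

Defaulting `dry_run` to `True` is a design choice of this sketch, so that a first run cannot modify the index by accident.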
Installation (Ubuntu)
Install Apertium:
wget https://apertium.projectjj.com/apt/install-nightly.sh -O - | sudo bash
sudo apt install apertium-all-dev
Create a Virtualenv and install Pip packages with Pipenv:
sudo apt install pipenv
make install
Installation (Docker)
Nothing to do as long as you have Docker on your system:
docker run -it icij/es-translator python es_translator.py --help
Examples
Translates documents from French to Spanish on a local Elasticsearch instance. The translated field is content (the default).
python es_translator.py --url "http://localhost:9200" --index my-index --source-language fr --target-language es
To translate the title field instead:
python es_translator.py --url "http://localhost:9200" --index my-index --source-language fr --target-language es --source-field title
Translates documents from English to Spanish on a local Elasticsearch instance using 4 parallel processes:
python es_translator.py --url "http://localhost:9200" --index my-index --source-language en --target-language es --pool-size 4
Translates documents from Portuguese to English, using an intermediary language (Apertium doesn't offer this translation pair):
python es_translator.py --url "http://localhost:9200" --index my-index --source-language pt --intermediary-language es --target-language en
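The option list says the intermediary language is calculated automatically when it is not provided. How es-translator does this is not described here, so the following is only an illustrative sketch of one way a pivot language could be chosen: look for a language that has a pair from the source and a pair to the target. The function name `find_intermediary` and the sample `pairs` set are assumptions made for the example, not the tool's actual code or Apertium's full pair list.

```python
def find_intermediary(source, target, pairs):
    """Pick a pivot language for source -> target (illustrative only).

    `pairs` is a set of (from_language, to_language) tuples.
    Returns None when a direct pair exists, otherwise the first
    (alphabetically) language X with source -> X and X -> target.
    Raises LookupError when no such pivot exists.
    """
    if (source, target) in pairs:
        return None  # direct translation is available, no pivot needed
    candidates = sorted(
        middle
        for (origin, middle) in pairs
        if origin == source and (middle, target) in pairs
    )
    if not candidates:
        raise LookupError(f"no translation path from {source} to {target}")
    return candidates[0]


if __name__ == "__main__":
    # Hypothetical sample of available pairs, for demonstration only.
    pairs = {("pt", "es"), ("es", "en"), ("fr", "es")}
    print(find_intermediary("pt", "en", pairs))
```

With the sample set, Portuguese to English resolves to Spanish as the pivot, which matches the explicit `--intermediary-language es` in the command above.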