TEA - Translation Engine Architect
A command-line tool for creating translation engines.
Install
First install pipx, then run:
pipx install pangeamt-tea
Usage
Step 1: Create a new project
tea new --customer customer --src_lang es --tgt_lang en --flavor automotion --version 2
This command will create the project directory structure:
├── customer_es_en_automotion_2
│ ├── config.yml
│ └── data
Then enter the directory:
cd customer_es_en_automotion_2
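Judging from the example above, the directory name appears to be the --customer, --src_lang, --tgt_lang, --flavor and --version values joined with underscores. The following is a minimal sketch based on that assumption only; `tea_project_dir` is a hypothetical helper, not part of tea. It predicts the directory name so that a script can enter it after `tea new`.

```shell
# Hypothetical helper, not part of tea: predicts the project directory name.
# Assumes the naming pattern seen in the example above:
#   <customer>_<src_lang>_<tgt_lang>_<flavor>_<version>
tea_project_dir() {
    printf '%s_%s_%s_%s_%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Same values as in the `tea new` example
tea_project_dir customer es en automotion 2
```

In a script this could be used as `cd "$(tea_project_dir customer es en automotion 2)"` once `tea new` has finished.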
Step 2: Configuration
Tokenizer
A tokenizer can be applied to the source and target languages:
tea config tokenizer --src mecab --tgt moses
To list all available tokenizers:
tea config tokenizer --help
Truecaser
tea config truecaser --src --tgt
BPE
tea config bpe -j
Processors
tea config processors -s "{processors}"
where processors is a list of preprocessing and postprocessing steps.
To list all available processors:
tea config processors --list
Config prepare
tea config prepare --shard_size 100000 --src_seq_length 400 --tgt_seq_length 400
Config model
tea config translation-model -n onmt
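The configuration commands above can be collected in one script. This is only a sketch: the `run` and `configure_project` functions and the `TEA_RUN` variable are hypothetical names introduced here, and the bpe and processors commands are left out because their arguments are shown only as placeholders in this section. By default the commands are printed rather than executed.

```shell
# Hypothetical wrapper: prints the command unless TEA_RUN=1 is set,
# in which case the command is executed.
run() {
    if [ "${TEA_RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Replays the configuration commands from this section with the same
# example values. Stops at the first command that fails.
configure_project() {
    run tea config tokenizer --src mecab --tgt moses &&
    run tea config truecaser --src --tgt &&
    run tea config prepare --shard_size 100000 --src_seq_length 400 --tgt_seq_length 400 &&
    run tea config translation-model -n onmt
}

configure_project
```

Running the script from inside the project directory with `TEA_RUN=1` would apply the settings; without it, the script serves as a dry run for review.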
Step 3:
Copy some multilingual resources (.tmx, bilingual files, .af) into the 'data' directory.
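As an illustration of this step, the sketch below copies .tmx and .af files from a source directory into a project's data directory. `copy_resources` is a hypothetical helper, not part of tea, and it assumes the project layout shown in Step 1. Other bilingual files would need to be copied separately, since this section does not name their extensions.

```shell
# Hypothetical helper, not part of tea.
# Usage: copy_resources <source_dir> <project_dir>
copy_resources() {
    src_dir=$1
    project_dir=$2
    for f in "$src_dir"/*.tmx "$src_dir"/*.af; do
        # Skip glob patterns that matched no file
        [ -e "$f" ] || continue
        cp "$f" "$project_dir/data/"
    done
}

# Example call (paths are placeholders):
# copy_resources /path/to/corpora customer_es_en_automotion_2
```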
Step 4: Run
Create workflow
tea workflow new
Clean the data passing the normalizers and validators:
tea workflow clean -n {clean_th} -d
where clean_th is the number of threads.
Preprocess the data (split data in train, dev or test, tokenization, BPE):
tea workflow prepare -n {prepare_th} -s 3
where prepare_th is the number of threads.
Train the model
tea workflow train --gpu 0
Evaluate the model
tea workflow eval --step {step} --src file.src --ref file.tgt --log file.log --out file.out --gpu 0
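The workflow steps up to training can be chained in a script so that a failure stops the pipeline. This is a sketch under stated assumptions: `run`, `run_workflow` and `TEA_RUN` are hypothetical names, the first subcommand is assumed to be spelled `workflow` as in the other commands, and the same thread count is used for the clean and prepare steps. Evaluation is left out because it needs a training step and file names specific to each project.

```shell
# Hypothetical wrapper: prints the command unless TEA_RUN=1 is set,
# in which case the command is executed.
run() {
    if [ "${TEA_RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Usage: run_workflow <threads>
# Runs the create, clean, prepare and train steps in order, with the
# option values shown in this section. Stops at the first failure.
run_workflow() {
    threads=$1
    run tea workflow new &&
    run tea workflow clean -n "$threads" -d &&
    run tea workflow prepare -n "$threads" -s 3 &&
    run tea workflow train --gpu 0
}

run_workflow 4
```

With `TEA_RUN` unset the script only prints the four commands, which makes it easy to check the thread count before starting a long training run.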
File details
Details for the file pangeamt-tea-0.2.32.tar.gz.
File metadata
- Download URL: pangeamt-tea-0.2.32.tar.gz
- Upload date:
- Size: 18.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.22.0 setuptools/44.0.0 requests-toolbelt/0.9.1 tqdm/4.30.0 CPython/3.7.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5a5f05f8b9c29338c5b182ebf490298f9f8171bc5f63a5f9acae608e4c9c4ef1
MD5 | 217df3856a80be2761061715e8fdb65f
BLAKE2b-256 | ef247aa5a31181a71e744bde8a735a9ae8c0388c26a5a303ea09de2ecc8c431c
File details
Details for the file pangeamt_tea-0.2.32-py3-none-any.whl.
File metadata
- Download URL: pangeamt_tea-0.2.32-py3-none-any.whl
- Upload date:
- Size: 24.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.22.0 setuptools/44.0.0 requests-toolbelt/0.9.1 tqdm/4.30.0 CPython/3.7.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6cf29f4ed7fb8e721e93a351a5cfde77dc5f7b0a25588e17d5ac0df259a56e3c
MD5 | 5540e01c39873c913b3d13437cd283cb
BLAKE2b-256 | b6059e331526a034e5c7e76b5c98dde7708c246a46621cd7695d6254f5f5cb86