TEA - Translation Engine Architect

A command-line tool for creating translation engines.

Install

First install pipx, then run:

pipx install pangeamt-tea

Usage

Step 1: Create a new project

tea new --customer customer --srcLang es --tgtLang en --flavor automotion --version 0.0.1

This command will create the project directory structure:

├── customer_es_en_automotion_0.0.1
│   ├── config.yml
│   └── data

Then enter the project directory:

cd customer_es_en_automotion_0.0.1
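The directory name in the example above looks like the five values passed to tea new, joined with underscores. The sketch below reproduces that pattern so a script can locate the project directory. The pattern is inferred from this single example, and project_dir_name is a hypothetical helper, not part of TEA.

```python
def project_dir_name(customer: str, src_lang: str, tgt_lang: str,
                     flavor: str, version: str) -> str:
    """Join the five `tea new` values with underscores.

    Assumption: this mirrors the one example in this README; TEA itself
    may build the name differently in cases not shown here.
    """
    return "_".join([customer, src_lang, tgt_lang, flavor, version])


print(project_dir_name("customer", "es", "en", "automotion", "0.0.1"))
# prints: customer_es_en_automotion_0.0.1
```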

Step 2: Configuration

Tokenizer

A tokenizer can be applied to both the source and the target:

tea tokenizer --src mecab  --tgt moses

To list all available tokenizers:

tea tokenizer --list 

Truecaser

tea truecaser --src --tgt

BPE

tea bpe -s -t

Processors

tea config processors -s "{processors}"

where {processors} is a list of preprocessors and postprocessors.
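When the configuration needs to be repeated across several projects, the four commands above can be collected in one place. The following is a minimal sketch that only assembles the argument lists exactly as written in this section and prints them. It does not run anything, config_commands is a hypothetical helper, and the processors value is passed through as an opaque string because its format is not described here.

```python
import shlex
from typing import List


def config_commands(src_tokenizer: str, tgt_tokenizer: str,
                    processors: str) -> List[List[str]]:
    """Return the Step 2 commands as argument lists, in the order shown above."""
    return [
        ["tea", "tokenizer", "--src", src_tokenizer, "--tgt", tgt_tokenizer],
        ["tea", "truecaser", "--src", "--tgt"],
        ["tea", "bpe", "-s", "-t"],
        # The processors string is forwarded unchanged (format not documented here).
        ["tea", "config", "processors", "-s", processors],
    ]


# Print each command shell-quoted, for review before running it by hand.
for argv in config_commands("mecab", "moses", "{processors}"):
    print(shlex.join(argv))
```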

Step 3: Add data

Copy some multilingual resources (.tmx, bilingual files, .af) into the 'data' directory.
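Before moving on, it can help to confirm what actually landed in the data directory. The sketch below counts files by extension. It is a generic check written for this README, not something TEA provides, and count_by_extension is a hypothetical name.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(data_dir: str) -> Counter:
    """Count regular files directly inside data_dir, keyed by lowercase suffix."""
    return Counter(
        p.suffix.lower() for p in Path(data_dir).iterdir() if p.is_file()
    )


# Run from inside the project directory; skipped if 'data' does not exist.
if Path("data").is_dir():
    for ext, n in sorted(count_by_extension("data").items()):
        print(f"{ext or '(no extension)'}: {n}")
```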

Step 4: Run

Clean the data by running it through the normalizers and validators:

tea workflow clean -n {clean_th} -d

where {clean_th} is the number of threads.

Preprocess the data (split it into train, dev, and test sets, then apply tokenization and BPE):

tea workflow prepare -n {prepare_th} -s 3

where {prepare_th} is the number of threads.

Train the model:

tea workflow train --gpu 0

Evaluate the model:

tea workflow eval --step {step} --src file.src --ref file.tgt --log file.log --out file.out --gpu 0
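The four workflow commands run in a fixed order, so they lend themselves to a small driver script. The following is a rough sketch under stated assumptions: workflow_commands and run_all are hypothetical helpers, the flags are copied verbatim from the commands above (including -d and -s 3, whose meaning is not explained here), and the thread counts and step value in the usage line are illustrative placeholders.

```python
import subprocess
from typing import Callable, List, Optional


def workflow_commands(clean_threads: int, prepare_threads: int,
                      step: str, gpu: int = 0) -> List[List[str]]:
    """Return the Step 4 commands as argument lists, in the order shown above."""
    return [
        ["tea", "workflow", "clean", "-n", str(clean_threads), "-d"],
        ["tea", "workflow", "prepare", "-n", str(prepare_threads), "-s", "3"],
        ["tea", "workflow", "train", "--gpu", str(gpu)],
        ["tea", "workflow", "eval", "--step", step,
         "--src", "file.src", "--ref", "file.tgt",
         "--log", "file.log", "--out", "file.out", "--gpu", str(gpu)],
    ]


def run_all(commands: List[List[str]],
            runner: Optional[Callable[[List[str]], object]] = None) -> None:
    """Run each command in order; the default runner stops at the first failure."""
    if runner is None:
        # check=True raises CalledProcessError on a non-zero exit status.
        runner = lambda argv: subprocess.run(argv, check=True)
    for argv in commands:
        runner(argv)


# Dry run: print the commands instead of executing them.
run_all(workflow_commands(4, 4, step="{step}"),
        runner=lambda argv: print(" ".join(argv)))
```

Calling run_all without a runner executes the commands for real, so it should be started from inside the project directory.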

