A tiny Computer-Assisted Translation tool

tinycat

A tiny computer-assisted translation (CAT) tool. Created for endcoronavirus.org.

Installation

pip install tinycat

Example

The text you want to translate is in the file document.txt:

This is the first paragrapth.

This is the second paragraph. Paragraphs have multiple sentences.

This is the third paragraph.
This is still the third paragraphs.
Paragraphs are divided by empty lines.
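
The paragraph rule above (paragraphs are separated by empty lines, and a paragraph may span several lines) can be sketched in a few lines of Python. This is only an illustration of the rule, not tinycat's actual code; the function name split_paragraphs is hypothetical.

```python
import re
from typing import List


def split_paragraphs(text: str) -> List[str]:
    """Split text into paragraphs separated by one or more blank lines."""
    # A "blank" line may still contain spaces or tabs, so match it loosely.
    blocks = re.split(r"\n\s*\n", text.strip())
    return [block.strip() for block in blocks if block.strip()]


sample = (
    "This is the first paragrapth.\n"
    "\n"
    "This is the second paragraph. Paragraphs have multiple sentences.\n"
    "\n"
    "This is the third paragraph.\n"
    "This is still the third paragraphs.\n"
    "Paragraphs are divided by empty lines.\n"
)

paragraphs = split_paragraphs(sample)
for number, paragraph in enumerate(paragraphs, start=1):
    print(number, repr(paragraph))
```

Line breaks inside a paragraph are preserved, so the third paragraph stays a single three-line unit.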

Run the following command to generate a translation from English to Polish:

python3 -m tinycat.cli translate --input-file document.txt --patch-file patch.txt --lang-in english --lang-out polish

Generated patch.txt:

----------------------------------------------------------------
This is the first paragrapth.

To jest pierwszy paragraf.
----------------------------------------------------------------

----------------------------------------------------------------
This is the second paragraph. Paragraphs have multiple sentences.

To jest drugi akapit. Akapity mają wiele zdań.
----------------------------------------------------------------

----------------------------------------------------------------
This is the third paragraph.
This is still the third paragraphs.
Paragraphs are divided by empty lines.

To jest trzeci akapit. To wciąż trzeci akapit. Akapity są podzielone pustymi wierszami.
----------------------------------------------------------------
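
To make the patch layout concrete, here is a minimal sketch of how such a file could be read back into (source, translation) pairs. It assumes only what the listing above shows: each block is wrapped in lines of dashes, and the source paragraph is separated from its translation by one empty line. The name parse_patch is hypothetical and this is not tinycat's own parser.

```python
import re
from typing import List, Optional, Tuple

# Assumption: a separator is a line made up only of dashes.
SEPARATOR = re.compile(r"^-{4,}$")


def parse_patch(text: str) -> List[Tuple[str, str]]:
    """Return (source, translation) pairs for every dash-delimited block."""
    pairs: List[Tuple[str, str]] = []
    block: Optional[List[str]] = None  # None means "outside a block"
    for line in text.splitlines():
        if SEPARATOR.match(line.strip()):
            if block is None:
                block = []  # opening separator
            else:
                # Closing separator: split the body at the first blank line.
                body = "\n".join(block).strip()
                parts = re.split(r"\n\s*\n", body, maxsplit=1)
                source = parts[0].strip()
                translation = parts[1].strip() if len(parts) > 1 else ""
                pairs.append((source, translation))
                block = None
        elif block is not None:
            block.append(line)
        # Lines outside any block are ignored.
    return pairs


patch = "\n".join([
    "-" * 64,
    "This is the first paragrapth.",
    "",
    "To jest pierwszy akapit.",
    "-" * 64,
    "",
    "-" * 64,
    "This is the third paragraph.",
    "This is still the third paragraphs.",
    "",
    "To jest trzeci akapit.",
    "To wciąż trzeci akapit.",
    "-" * 64,
])

pairs = parse_patch(patch)
for source, translation in pairs:
    print(repr(source), "->", repr(translation))
```

Splitting at the first blank line works because a source paragraph cannot itself contain one.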

Modify patch.txt and save it as patch_corrected.txt. In this case we corrected paragraf to akapit in the first paragraph and, in the third paragraph, put each sentence on its own line for readability.

Apply the patch using:

python3 -m tinycat.cli update --patch-file patch_corrected.txt --dict-file en-pl.dict

Now the text can be translated as follows (note that we are passing en-pl.dict):

python3 -m tinycat.cli translate --input-file document.txt --patch-file translated.txt --dict-file en-pl.dict --lang-in english --lang-out polish

The translated.txt file does not contain any paragraphs to translate, because all the translations are taken from the en-pl.dict dictionary. The content of translated.txt is:

To jest pierwszy akapit.

To jest drugi akapit. Akapity mają wiele zdań.

To jest trzeci akapit.
To wciąż trzeci akapit.
Akapity są podzielone pustymi wierszami.

In the next step we modify document.txt to add a new paragraph and to correct the typo in the first paragraph:

This is the first paragraph.

This is the second paragraph. Paragraphs have multiple sentences.

This is the third paragraph.
This is still the third paragraphs.
Paragraphs are divided by empty lines.

Final paragraph.

Let's create a new patch for the human translator:

python3 -m tinycat.cli translate --input-file document.txt --patch-file patch-2.txt --dict-file en-pl.dict --lang-in english --lang-out polish

patch-2.txt contains new translations that have to be checked:

----------------------------------------------------------------
This is the first paragraph.

To jest pierwszy akapit.
----------------------------------------------------------------

To jest drugi akapit. Akapity mają wiele zdań.

To jest trzeci akapit.
To wciąż trzeci akapit.
Akapity są podzielone pustymi wierszami.

----------------------------------------------------------------
Final paragraph.

Ostatni akapit.
----------------------------------------------------------------
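
The listing above mixes two kinds of paragraphs: those already found in the dictionary appear as plain translated text, and those that are new or changed appear as dash-delimited blocks for review. A rough sketch of that behaviour follows. The in-memory dict, the suggest callback (standing in for whatever produces the draft translation), the 64-dash separator width, and the name build_patch are all assumptions made for illustration, not tinycat's internals.

```python
from typing import Callable, Dict, List

SEPARATOR = "-" * 64  # assumed width, for illustration only


def build_patch(
    paragraphs: List[str],
    dictionary: Dict[str, str],
    suggest: Callable[[str], str],
) -> str:
    """Emit known translations as-is and wrap unknown paragraphs for review."""
    chunks: List[str] = []
    for paragraph in paragraphs:
        if paragraph in dictionary:
            # Exact match: reuse the stored translation.
            chunks.append(dictionary[paragraph])
        else:
            # New or edited paragraph: show source and draft side by side.
            chunks.append(
                "\n".join([SEPARATOR, paragraph, "", suggest(paragraph), SEPARATOR])
            )
    return "\n\n".join(chunks) + "\n"


dictionary = {
    "This is the second paragraph. Paragraphs have multiple sentences.":
        "To jest drugi akapit. Akapity mają wiele zdań.",
}
# Canned drafts stand in for a real translation backend.
canned = {"Final paragraph.": "Ostatni akapit."}

patch = build_patch(
    [
        "This is the second paragraph. Paragraphs have multiple sentences.",
        "Final paragraph.",
    ],
    dictionary,
    suggest=lambda paragraph: canned.get(paragraph, ""),
)
print(patch)
```

Because the lookup in this sketch is an exact match on the whole paragraph, fixing a single typo in the source is enough to send that paragraph back for review, which matches what happens to the first paragraph in the example.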

This time everything is fine, so we can apply the patch:

python3 -m tinycat.cli update --patch-file patch-2.txt --dict-file en-pl.dict

And see the final translated text with the following command (if we don't pass --patch-file, the output is printed to the console):

python3 -m tinycat.cli translate --input-file document.txt --dict-file en-pl.dict --lang-in english --lang-out polish
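
If you want to script the translate and update rounds, the commands can be assembled in Python and run with subprocess. The sketch below uses only the flags shown in this example; the helper names translate_command, update_command and run are hypothetical, and nothing is executed unless you call run yourself with tinycat installed.

```python
import subprocess
import sys
from typing import List, Optional


def translate_command(
    input_file: str,
    dict_file: Optional[str] = None,
    patch_file: Optional[str] = None,
    lang_in: str = "english",
    lang_out: str = "polish",
) -> List[str]:
    """Build the argument list for the translate step."""
    cmd = [
        sys.executable, "-m", "tinycat.cli", "translate",
        "--input-file", input_file,
        "--lang-in", lang_in,
        "--lang-out", lang_out,
    ]
    if patch_file is not None:
        cmd += ["--patch-file", patch_file]
    if dict_file is not None:
        cmd += ["--dict-file", dict_file]
    return cmd


def update_command(patch_file: str, dict_file: str) -> List[str]:
    """Build the argument list for the update step."""
    return [
        sys.executable, "-m", "tinycat.cli", "update",
        "--patch-file", patch_file,
        "--dict-file", dict_file,
    ]


def run(cmd: List[str]) -> None:
    """Run one step and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


# Print the commands for the second round of the example.
print(" ".join(translate_command("document.txt", "en-pl.dict", "patch-2.txt")))
print(" ".join(update_command("patch-2.txt", "en-pl.dict")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.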

Help

python3 -m tinycat.cli translate --help
python3 -m tinycat.cli update --help

Download files

Source distribution: tinycat-0.1.1.tar.gz (3.7 kB)

Built distribution: tinycat-0.1.1-py3-none-any.whl (4.6 kB, Python 3)
