Transformer-based translation quality estimation

TransQuest: Translation Quality Estimation with Cross-lingual Transformers

TransQuest provides state-of-the-art models for translation quality estimation.

TransQuest was the winning solution in the WMT 2020 Quality Estimation Shared Task (Sentence-Level Direct Assessment).

Features

  • Sentence-level translation quality estimation for both aspects: predicting post-editing effort and direct assessment.
  • Performs significantly better than existing quality estimation methods such as DeepQuest and OpenKiwi in all the languages experimented with.
  • Pre-trained quality estimation models for seven languages.
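
Post-editing effort is commonly measured with HTER: the number of edits needed to turn the machine translation into its human post-edited version, divided by the number of words in the post-edited version. The sketch below is only an approximation for illustration. It uses plain word-level edit distance and ignores the block shifts that the full TER metric counts as single edits, and the function names are hypothetical rather than part of TransQuest.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, 1):
        curr = [i]
        for j, ref_word in enumerate(ref, 1):
            cost = 0 if hyp_word == ref_word else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def approx_hter(mt_output, post_edit):
    """Approximate HTER: edits divided by post-edit length (no shift edits)."""
    hyp = mt_output.split()
    ref = post_edit.split()
    if not ref:
        raise ValueError("post-edited sentence must not be empty")
    return word_edit_distance(hyp, ref) / len(ref)


# One missing word in a six-word post-edit.
score = approx_hter("the cat sat on mat", "the cat sat on the mat")
print(round(score, 4))
```

A score of 0 means the translation needed no editing; larger values mean more post-editing effort.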

Installation

You first need to install PyTorch. The recommended PyTorch version is 1.5. Please refer to the PyTorch installation page for the specific install command for your platform.

Once PyTorch is installed, you can install TransQuest from source or from pip.

From Source

git clone https://github.com/TharinduDR/TransQuest.git
cd TransQuest
pip install -r requirements.txt

From pip

pip install transquest

Run the examples

Examples are included in the repository but are not shipped with the library.

  1. WMT 2020 Sentence-level Direct Assessment QE Shared Task
  2. WMT 2020 Sentence-level Post-Editing Effort QE Shared Task
  3. WMT 2019 Sentence-level Post-Editing Effort QE Shared Task
  4. WMT 2018 Sentence-level Post-Editing Effort QE Shared Task

TransQuest Model Zoo

The following pre-trained models have been released. We will keep releasing new models, so please stay in touch.

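| Language Pair | Objective | Algorithm | Model Link | Data | Pearson | MAE | RMSE |
|---|---|---|---|---|---|---|---|
| Romanian-English (NMT) | Direct | TransQuest | model.zip | WMT 2020 | 0.8982 | 0.3121 | 0.4097 |
| Romanian-English (NMT) | Direct | SiameseTransQuest | model.zip | WMT 2020 | 0.8501 | 0.3637 | 0.4932 |
| Romanian-English (NMT) | Direct | OpenKiwi | | WMT 2020 | 0.6845 | 0.7596 | 1.0522 |
| Estonian-English (NMT) | Direct | TransQuest | model.zip | WMT 2020 | 0.7748 | 0.5904 | 0.7321 |
| Estonian-English (NMT) | Direct | SiameseTransQuest | model.zip | WMT 2020 | 0.6804 | 0.7047 | 0.9022 |
| Estonian-English (NMT) | Direct | OpenKiwi | | WMT 2020 | 0.4770 | 0.9176 | 1.1382 |
| Nepalese-English (NMT) | Direct | TransQuest | model.zip | WMT 2020 | 0.7914 | 0.3975 | 0.5078 |
| Nepalese-English (NMT) | Direct | SiameseTransQuest | model.zip | WMT 2020 | 0.6081 | 0.6531 | 0.7950 |
| Nepalese-English (NMT) | Direct | OpenKiwi | | WMT 2020 | 0.3860 | 0.7353 | 0.8713 |
| Sinhala-English (NMT) | Direct | TransQuest | model.zip | WMT 2020 | 0.6525 | 0.4510 | 0.5570 |
| Sinhala-English (NMT) | Direct | SiameseTransQuest | model.zip | WMT 2020 | 0.5957 | 0.5078 | 0.6466 |
| Sinhala-English (NMT) | Direct | OpenKiwi | | WMT 2020 | 0.3737 | 0.7517 | 0.8978 |
| Russian-English (NMT) | Direct | TransQuest | model.zip | WMT 2020 | 0.7734 | 0.5076 | 0.6856 |
| Russian-English (NMT) | Direct | SiameseTransQuest | model.zip | WMT 2020 | | | |
| Russian-English (NMT) | Direct | OpenKiwi | | WMT 2020 | 0.5479 | 0.8253 | 1.1930 |
| English-German (NMT) | Direct | TransQuest | model.zip | WMT 2020 | 0.4669 | 0.6474 | 0.7762 |
| English-German (NMT) | Direct | SiameseTransQuest | model.zip | WMT 2020 | | | |
| English-German (NMT) | Direct | OpenKiwi | | WMT 2020 | 0.1455 | 0.6791 | 0.9670 |
| English-German (NMT) | HTER | TransQuest | model.zip | WMT 2020 | 0.4994 | 0.1486 | 0.1842 |
| English-German (NMT) | HTER | SiameseTransQuest | model.zip | WMT 2020 | | | |
| English-German (NMT) | HTER | OpenKiwi | | WMT 2020 | 0.3916 | 0.1500 | 0.1896 |
| English-Chinese (NMT) | Direct | TransQuest | model.zip | WMT 2020 | 0.4779 | 0.9865 | 1.1338 |
| English-Chinese (NMT) | Direct | SiameseTransQuest | model.zip | WMT 2020 | 0.4067 | 1.0389 | 1.1973 |
| English-Chinese (NMT) | Direct | OpenKiwi | | WMT 2020 | 0.1676 | 0.6559 | 0.8503 |
| English-Chinese (NMT) | HTER | TransQuest | model.zip | WMT 2020 | 0.5910 | 0.1400 | 0.1717 |
| English-Chinese (NMT) | HTER | SiameseTransQuest | model.zip | WMT 2020 | | | |
| English-Chinese (NMT) | HTER | OpenKiwi | | WMT 2020 | 0.5058 | 0.1470 | 0.1814 |
| English-Latvian (SMT) | HTER | TransQuest | model.zip | WMT 2018 | 0.7141 | 0.1041 | 0.1420 |
| English-Latvian (SMT) | HTER | SiameseTransQuest | | WMT 2018 | | | |
| English-Latvian (SMT) | HTER | Quest++ | | WMT 2018 | 0.3528 | 0.1554 | 0.1919 |
| English-Latvian (NMT) | HTER | TransQuest | model.zip | WMT 2018 | 0.7450 | 0.1162 | 0.1601 |
| English-Latvian (NMT) | HTER | SiameseTransQuest | | WMT 2018 | | | |
| English-Latvian (NMT) | HTER | Quest++ | | WMT 2018 | 0.4435 | 0.1625 | 0.2164 |
| English-German (SMT) | HTER | TransQuest | model.zip | WMT 2018 | 0.7355 | 0.0967 | 0.1300 |
| English-German (SMT) | HTER | SiameseTransQuest | | WMT 2018 | | | |
| English-German (SMT) | HTER | Quest++ | | WMT 2018 | 0.3653 | 0.1402 | 0.1772 |
| English-Czech (SMT) | HTER | TransQuest | model.zip | WMT 2018 | 0.7150 | 0.1198 | 0.1611 |
| English-Czech (SMT) | HTER | SiameseTransQuest | | WMT 2018 | | | |
| English-Czech (SMT) | HTER | Quest++ | | WMT 2018 | 0.3943 | 0.1651 | 0.2110 |
| German-English (SMT) | HTER | TransQuest | model.zip | WMT 2018 | 0.7878 | 0.0934 | 0.1277 |
| German-English (SMT) | HTER | SiameseTransQuest | | WMT 2018 | | | |
| German-English (SMT) | HTER | Quest++ | | WMT 2018 | 0.3323 | 0.1508 | 0.1928 |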
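
The three metric columns compare predicted quality scores against gold scores: Pearson correlation (higher is better), mean absolute error and root mean squared error (both lower is better). As a minimal sketch of how such figures are computed, the plain-Python functions below use made-up scores for illustration only; they are not the evaluation script used to produce the numbers above.

```python
import math


def pearson(pred, gold):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_g = sum(gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_g = sum((g - mean_g) ** 2 for g in gold)
    return cov / math.sqrt(var_p * var_g)


def mae(pred, gold):
    """Mean absolute error."""
    return sum(abs(p - g) for p, g in zip(pred, gold)) / len(pred)


def rmse(pred, gold):
    """Root mean squared error."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(pred))


# Illustrative values only, not taken from any WMT dataset.
gold_scores = [1.0, 2.0, 3.0, 4.0]
predictions = [1.5, 2.5, 3.5, 4.5]

print(pearson(predictions, gold_scores))
print(mae(predictions, gold_scores))
print(rmse(predictions, gold_scores))
```

Note that a constant offset in the predictions, as in this toy data, leaves the Pearson correlation perfect while still producing non-zero MAE and RMSE, which is why the table reports all three.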

Once downloaded and unzipped, the pre-trained models can be loaded easily:

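import torch  # needed for torch.cuda.is_available()
# "path" is the directory of the unzipped model; transformer_config is a configuration dict such as the one used in the repository examples.
model = QuestModel("xlmroberta", "path", num_labels=1,
                   use_cuda=torch.cuda.is_available(), args=transformer_config)
siamese_model = SiameseTransQuestModel("path")  # Siamese variant, loaded from its own unzipped directory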
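
The unzip step can be done with the Python standard library. The helper below is a sketch under assumptions: the function name, the archive name and the target directory are hypothetical, and it assumes the downloaded archive contains the model files directly in the layout the loaders expect.

```python
import zipfile
from pathlib import Path


def extract_model(archive_path, target_dir):
    """Unzip a downloaded model archive and return the extraction directory."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
    return target


# Hypothetical usage; "model.zip" and "models/ro-en" are placeholder names.
# model_dir = extract_model("model.zip", "models/ro-en")
# The returned directory can then be passed as the "path" argument when loading.
```

Keeping each language pair in its own directory avoids one model's files overwriting another's.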

Citations

Please consider citing us if you use the library.

@InProceedings{transquest:2020,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest: Translation Quality Estimation with Cross-lingual Transformers},
booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
year = {2020}
}

The task-specific paper on the WMT 2020 sentence-level DA task, which won first place in the competition:

@InProceedings{transquest:2020b,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest at WMT2020: Sentence-Level Direct Assessment},
booktitle = {Proceedings of the Fifth Conference on Machine Translation},
year = {2020}
}
