Transformer-based translation quality estimation

These details have not been verified by PyPI

Project links

Homepage

Project description

TransQuest: Translation Quality Estimation with Cross-lingual Transformers.

TransQuest provides state-of-the-art models for translation quality estimation.

TransQuest was the winning solution in the WMT 2020 Quality Estimation Shared Task (Sentence-Level Direct Assessment).

Features

Sentence-level translation quality estimation for both aspects: predicting post-editing effort and direct assessment.
Performs significantly better than current state-of-the-art quality estimation methods such as DeepQuest and OpenKiwi in all the languages experimented with.
Pre-trained quality estimation models for seven languages.

Installation

You first need to install PyTorch. The recommended PyTorch version is 1.5. Please refer to the PyTorch installation page for the specific install command for your platform.

Once PyTorch has been installed, you can install TransQuest from source or from pip.

From Source

git clone https://github.com/TharinduDR/TransQuest.git
cd TransQuest
pip install -r requirements.txt

From pip

pip install transquest
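
To confirm that both packages are visible to the interpreter after installation, a small check along the following lines can be used. This is a minimal sketch that relies only on the Python standard library (Python 3.8 or later); the helper name installed_version is illustrative and not part of TransQuest.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report whether PyTorch and TransQuest can be found in the current environment.
    for name in ("torch", "transquest"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

If torch is reported as missing, install PyTorch first, as described above.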

Run the examples

Examples are included in the repository but are not shipped with the library.

TransQuest Model Zoo

The following pre-trained models have been released. We will keep releasing new models, so please keep in touch.

Language Pair	Objective	Algorithm	Model Link	Data	Pearson	MAE	RMSE
Romanian-English (NMT)	Direct	TransQuest	model.zip	WMT 2020	0.8982	0.3121	0.4097
		SiameseTransQuest	model.zip	WMT 2020	0.8501	0.3637	0.4932
		OpenKiwi		WMT 2020	0.6845	0.7596	1.0522
Estonian-English (NMT)	Direct	TransQuest	model.zip	WMT 2020	0.7748	0.5904	0.7321
		SiameseTransQuest	model.zip	WMT 2020	0.6804	0.7047	0.9022
		OpenKiwi		WMT 2020	0.4770	0.9176	1.1382
Nepalese-English (NMT)	Direct	TransQuest	model.zip	WMT 2020	0.7914	0.3975	0.5078
		SiameseTransQuest	model.zip	WMT 2020	0.6081	0.6531	0.7950
		OpenKiwi		WMT 2020	0.3860	0.7353	0.8713
Sinhala-English (NMT)	Direct	TransQuest	model.zip	WMT 2020	0.6525	0.4510	0.5570
		SiameseTransQuest	model.zip	WMT 2020	0.5957	0.5078	0.6466
		OpenKiwi		WMT 2020	0.3737	0.7517	0.8978
Russian-English (NMT)	Direct	TransQuest	model.zip	WMT 2020	0.7734	0.5076	0.6856
		SiameseTransQuest	model.zip	WMT 2020
		OpenKiwi		WMT 2020	0.5479	0.8253	1.1930
English-German (NMT)	Direct	TransQuest	model.zip	WMT 2020	0.4669	0.6474	0.7762
		SiameseTransQuest	model.zip	WMT 2020
		OpenKiwi		WMT 2020	0.1455	0.6791	0.9670
	HTER	TransQuest	model.zip	WMT 2020	0.4994	0.1486	0.1842
		SiameseTransQuest	model.zip	WMT 2020
		OpenKiwi		WMT 2020	0.3916	0.1500	0.1896
English-Chinese (NMT)	Direct	TransQuest	model.zip	WMT 2020	0.4779	0.9865	1.1338
		SiameseTransQuest	model.zip	WMT 2020	0.4067	1.0389	1.1973
		OpenKiwi		WMT 2020	0.1676	0.6559	0.8503
	HTER	TransQuest	model.zip	WMT 2020	0.5910	0.1400	0.1717
		SiameseTransQuest	model.zip	WMT 2020
		OpenKiwi		WMT 2020	0.5058	0.1470	0.1814
English-Latvian (SMT)	HTER	TransQuest	model.zip	WMT 2018	0.7141	0.1041	0.1420
		SiameseTransQuest		WMT 2018
		Quest++		WMT 2018	0.3528	0.1554	0.1919
English-Latvian (NMT)	HTER	TransQuest	model.zip	WMT 2018	0.7450	0.1162	0.1601
		SiameseTransQuest		WMT 2018
		Quest++		WMT 2018	0.4435	0.1625	0.2164
English-German (SMT)	HTER	TransQuest	model.zip	WMT 2018	0.7355	0.0967	0.1300
		SiameseTransQuest		WMT 2018
		Quest++		WMT 2018	0.3653	0.1402	0.1772
English-Czech (SMT)	HTER	TransQuest	model.zip	WMT 2018	0.7150	0.1198	0.1611
		SiameseTransQuest		WMT 2018
		Quest++		WMT 2018	0.3943	0.1651	0.2110
German-English (SMT)	HTER	TransQuest	model.zip	WMT 2018	0.7878	0.0934	0.1277
		SiameseTransQuest		WMT 2018
		Quest++		WMT 2018	0.3323	0.1508	0.1928
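
The three metric columns in the table (Pearson, MAE, RMSE) compare predicted quality scores with gold scores. As a rough illustration, they can be computed as in the sketch below, which assumes the standard textbook definitions; it is not the evaluation script used to produce the numbers above, and the function names are illustrative.

```python
import math


def pearson(predictions, golds):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(predictions)
    mean_p = sum(predictions) / n
    mean_g = sum(golds) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(predictions, golds))
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    var_g = sum((g - mean_g) ** 2 for g in golds)
    return cov / math.sqrt(var_p * var_g)


def mae(predictions, golds):
    """Mean absolute error."""
    return sum(abs(p - g) for p, g in zip(predictions, golds)) / len(predictions)


def rmse(predictions, golds):
    """Root mean squared error."""
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(predictions, golds)) / len(predictions))
```

A higher Pearson value and lower MAE and RMSE values indicate better agreement with the gold scores.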

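Once downloaded and unzipped, the models can be loaded as shown below. QuestModel and SiameseTransQuestModel are provided by the transquest package. Here "path" is the directory of the unzipped model, and transformer_config is a configuration object that must be defined beforehand.

import torch

# Transformer-based model; uses the GPU when one is available
model = QuestModel("xlmroberta", "path", num_labels=1,
                   use_cuda=torch.cuda.is_available(), args=transformer_config)

# Siamese variant
model = SiameseTransQuestModel("path")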
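
After loading, a model can be used to score source/translation pairs. The sketch below is an assumption about how that might look: it presumes the loaded model exposes a predict method that takes a list of [source, translation] pairs and returns a (predictions, raw_outputs) tuple. That signature is not stated on this page, so check it against your installed version. build_pairs and score_pairs are hypothetical helpers, not part of the library.

```python
def build_pairs(sources, translations):
    """Pair each source sentence with its translation as [source, translation]."""
    if len(sources) != len(translations):
        raise ValueError("sources and translations must have the same length")
    return [[src, tgt] for src, tgt in zip(sources, translations)]


def score_pairs(model, sources, translations):
    """Return the model's quality predictions for the given sentence pairs."""
    pairs = build_pairs(sources, translations)
    # ASSUMPTION: predict(pairs) returns a (predictions, raw_outputs) tuple.
    predictions, _raw_outputs = model.predict(pairs)
    return predictions
```

Keeping the pairing logic in its own function makes it easy to validate the input before any expensive model call is made.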

Citations

Please consider citing us if you use the library.

@InProceedings{transquest:2020,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest: Translation Quality Estimation with Cross-lingual Transformers},
booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
year = {2020}
}

The task-specific paper on the WMT 2020 sentence-level direct assessment (DA) task, in which TransQuest won first place:

@InProceedings{transquest:2020b,
author = {Ranasinghe, Tharindu and Orasan, Constantin and Mitkov, Ruslan},
title = {TransQuest at WMT2020: Sentence-Level Direct Assessment},
booktitle = {Proceedings of the Fifth Conference on Machine Translation},
year = {2020}
}

Project details

Release history

1.1.1

Apr 26, 2021

1.1.0

Apr 26, 2021

1.0.2

Mar 23, 2021

1.0.2b0 pre-release

Mar 19, 2021

1.0.1b0 pre-release

Mar 19, 2021

1.0.0

Mar 20, 2021

1.0.0b0 pre-release

Mar 19, 2021

0.2.5

Oct 9, 2020

0.2.4

Oct 9, 2020

This version

0.2.3

Oct 9, 2020

0.2.2

Jul 15, 2020

0.2.1

Jul 1, 2020

0.2.0

Jun 8, 2020

0.1.0

Apr 9, 2020

Download files

Source Distribution

transquest-0.2.3.tar.gz (78.7 kB)

Uploaded Oct 9, 2020 Source

Hashes for transquest-0.2.3.tar.gz

Algorithm	Hash digest
SHA256	`6a2ee192a5244feb44e07b807d54de0dd0d34b6634002155906a075f503108b9`
MD5	`6202ac1f62637658bd6b4a026fbed577`
BLAKE2b-256	`2d56ae8c2cfff0aa5786dacda895992ce58c7e029327283b90857b9673363f6e`
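
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the function names and the chunk size are illustrative choices, and the file path passed in is whatever location you saved the archive to.

```python
import hashlib

# SHA256 digest listed above for transquest-0.2.3.tar.gz
EXPECTED_SHA256 = "6a2ee192a5244feb44e07b807d54de0dd0d34b6634002155906a075f503108b9"


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_expected("transquest-0.2.3.tar.gz") should return True for an intact download of this release.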