High-quality Machine Translation Evaluation
Quick Installation
Detailed usage examples and instructions can be found in the Full Documentation.
Simple installation from PyPI
Pre-release of version 1.0:
pip install unbabel-comet==1.0.0rc2
To develop locally, install Poetry and run the following commands:
git clone https://github.com/Unbabel/COMET
cd COMET
poetry install
Scoring MT outputs:
Via Bash:
Examples from WMT20:
echo -e "Dem Feuer konnte Einhalt geboten werden\nSchulen und Kindergärten wurden eröffnet." > src.de
echo -e "The fire could be stopped\nSchools and kindergartens were open" > hyp.en
echo -e "They were able to control the fire.\nSchools and kindergartens opened" > ref.en
comet-score -s src.de -t hyp.en -r ref.en
You can select another model/metric with the --model flag. For reference-free (QE-as-a-metric) models, you don't need to pass a reference.
comet-score -s src.de -t hyp.en -r ref.en --model wmt21-comet-qe-da
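When scoring from a script, the same invocations can be assembled programmatically. The sketch below is only illustrative: `build_comet_score_cmd` is a hypothetical helper, not part of COMET, and it uses only the flags shown above (`-s`, `-t`, `-r`, `--model`). It builds the argument list and leaves execution to `subprocess.run`.

```python
from typing import List, Optional


def build_comet_score_cmd(
    src: str,
    hyp: str,
    ref: Optional[str] = None,
    model: Optional[str] = None,
) -> List[str]:
    """Assemble a comet-score command line (hypothetical helper, not part of COMET)."""
    cmd = ["comet-score", "-s", src, "-t", hyp]
    if ref is not None:
        # Reference-free (QE-as-a-metric) models can be scored without -r.
        cmd += ["-r", ref]
    if model is not None:
        cmd += ["--model", model]
    return cmd


if __name__ == "__main__":
    print(build_comet_score_cmd("src.de", "hyp.en", ref="ref.en"))
    print(build_comet_score_cmd("src.de", "hyp.en", model="wmt21-comet-qe-da"))
    # To actually run a command (requires unbabel-comet to be installed):
    #   import subprocess
    #   subprocess.run(build_comet_score_cmd("src.de", "hyp.en", ref="ref.en"), check=True)
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces.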
Following the work on Uncertainty-Aware MT Evaluation, you can use the --mc_dropout flag to get a variance/uncertainty value for each segment score. A high value means that the metric is less confident in that prediction.
comet-score -s src.de -t hyp.en -r ref.en --mc_dropout 100
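The idea behind Monte Carlo dropout can be illustrated without COMET itself. The following is a toy sketch, not COMET's implementation: `toy_score` is a made-up stand-in for a model forward pass with dropout left active, and `mc_dropout_score` repeats it `n_passes` times (assumed here to correspond to the number passed to the flag) and reports the mean and variance of the resulting scores.

```python
import random
import statistics
from typing import Sequence, Tuple


def toy_score(
    features: Sequence[float],
    weights: Sequence[float],
    drop_prob: float,
    rng: random.Random,
) -> float:
    """One stochastic forward pass of a toy linear scorer with dropout on its inputs."""
    total = 0.0
    for x, w in zip(features, weights):
        if rng.random() < drop_prob:
            continue  # this input is dropped for the current pass
        # Inverted dropout: rescale kept inputs so the expected score is unchanged.
        total += w * x / (1.0 - drop_prob)
    return total


def mc_dropout_score(
    features: Sequence[float],
    weights: Sequence[float],
    n_passes: int = 100,
    drop_prob: float = 0.1,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (mean, variance) of the score over n_passes stochastic passes."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    rng = random.Random(seed)
    scores = [toy_score(features, weights, drop_prob, rng) for _ in range(n_passes)]
    return statistics.mean(scores), statistics.pvariance(scores)


if __name__ == "__main__":
    mean, variance = mc_dropout_score([0.5, 1.0, -0.3], [0.2, 0.4, 0.1], drop_prob=0.3)
    print(f"score={mean:.4f} variance={variance:.4f}")
```

A larger variance means the stochastic passes disagree more, which is read as lower confidence in the segment score.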
Languages Covered:
All the above-mentioned models are built on top of XLM-R, which covers the following languages:
Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Basque, Belarusian, Bengali, Bengali Romanized, Bosnian, Breton, Bulgarian, Burmese, Catalan, Chinese (Simplified), Chinese (Traditional), Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Hausa, Hebrew, Hindi, Hindi Romanized, Hungarian, Icelandic, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish (Kurmanji), Kyrgyz, Lao, Latin, Latvian, Lithuanian, Macedonian, Malagasy, Malay, Malayalam, Marathi, Mongolian, Nepali, Norwegian, Oriya, Oromo, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Sanskrit, Scottish Gaelic, Serbian, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tamil, Tamil Romanized, Telugu, Telugu Romanized, Thai, Turkish, Ukrainian, Urdu, Urdu Romanized, Uyghur, Uzbek, Vietnamese, Welsh, Western Frisian, Xhosa, Yiddish.
Thus, results for language pairs containing uncovered languages are unreliable!
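A simple guard can flag such language pairs before scoring. This is a minimal sketch, not part of COMET: `COVERED_LANGUAGES` and `uncovered_languages` are hypothetical names, and the set below is only a partial copy of the list above that would need to be completed before real use.

```python
from typing import Iterable, List

# Partial copy of the language list above, for illustration only.
# Extend it to the full list before relying on the check.
COVERED_LANGUAGES = {
    "Chinese (Simplified)",
    "English",
    "French",
    "German",
    "Japanese",
    "Portuguese",
    "Swahili",
}


def uncovered_languages(languages: Iterable[str]) -> List[str]:
    """Return the languages that are missing from COVERED_LANGUAGES, in input order."""
    return [lang for lang in languages if lang not in COVERED_LANGUAGES]


if __name__ == "__main__":
    # "Maltese" does not appear in the list above, so it is reported.
    missing = uncovered_languages(["German", "Maltese"])
    if missing:
        print("Scores may be unreliable; uncovered languages:", ", ".join(missing))
```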
Scoring within Python:
COMET implements the PyTorch Lightning model interface, which means that you need to initialize a trainer in order to run inference.
import torch
from comet import download_model, load_from_checkpoint
from pytorch_lightning.trainer.trainer import Trainer
from torch.utils.data import DataLoader

# Download the checkpoint and load the model from it.
model = load_from_checkpoint(
    download_model("wmt20-comet-da")
)

# One dict per segment: source, MT hypothesis and reference.
# The list is already in the row format the DataLoader needs,
# so no further conversion is required.
data = [
    {
        "src": "Dem Feuer konnte Einhalt geboten werden",
        "mt": "The fire could be stopped",
        "ref": "They were able to control the fire."
    },
    {
        "src": "Schulen und Kindergärten wurden eröffnet.",
        "mt": "Schools and kindergartens were open",
        "ref": "Schools and kindergartens opened"
    }
]

# Collate each batch with the model's own prepare_sample in inference mode.
dataloader = DataLoader(
    dataset=data,
    batch_size=16,
    collate_fn=lambda x: model.prepare_sample(x, inference=True),
    num_workers=4,
)

# Run inference on a single GPU without creating a logger.
trainer = Trainer(gpus=1, deterministic=True, logger=False)
predictions = trainer.predict(
    model, dataloaders=dataloader, return_predictions=True
)
# Concatenate the per-batch tensors into a flat list of segment-level scores.
predictions = torch.cat(predictions, dim=0).tolist()
Note: The Python interface returns a list of segment-level scores. You can obtain the corpus-level score by averaging the segment-level scores.
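The averaging step is a plain arithmetic mean. A minimal sketch follows, where `corpus_score` is a hypothetical helper name and the scores are made-up values rather than real COMET output:

```python
from statistics import mean
from typing import List


def corpus_score(segment_scores: List[float]) -> float:
    """Corpus-level score as the arithmetic mean of the segment-level scores."""
    if not segment_scores:
        raise ValueError("no segment scores to average")
    return mean(segment_scores)


if __name__ == "__main__":
    # Made-up segment scores for illustration, not real COMET output.
    predictions = [0.25, 0.75]
    print(corpus_score(predictions))
```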
Model Zoo:
:TODO: Update model zoo after the shared task.
Model | Description |
---|---|
wmt20-comet-da | DEFAULT: Regression model built on top of XLM-R (large) trained on DA from WMT17 to WMT19. This model was presented at the WMT20 Metrics shared task: Rei et al., 2020. Same as wmt-large-da-estimator-1719 from previous versions. |
emnlp20-comet-rank | Translation Ranking model built on top of XLM-R (base) trained with DARR from WMT17 and WMT18. This model was presented at EMNLP20: Rei et al., 2020. |
wmt21-comet-da | Regression model built on top of XLM-R (large) trained on DA from WMT15 to WMT20. This model was presented at the WMT21 Metrics shared task. |
wmt21-comet-mqm | Regression model built on top of XLM-R (large) trained to maximize correlation with MQM annotations from Freitag et al., 2020. |
QE-as-a-metric:
The following models can be used to assess translation quality without the need for references!
Model | Description |
---|---|
wmt21-comet-qe-da | Reference-free Regression model built on top of XLM-R (large) trained on DA from WMT15 to WMT20. This model was presented at the WMT21 Metrics shared task. |
wmt21-comet-qe-mqm | Reference-free Regression model built on top of XLM-R (large) trained to maximize correlation with MQM annotations from Freitag et al., 2020. |
Lightweight models:
One of the remaining redeeming qualities of automated metrics such as BLEU is that they are incredibly lightweight. For that reason, we have been developing COMETinhos, lightweight versions of the previous models.
Model | Description |
---|---|
wmt21-cometinho-da | Regression model built on top of XLM-R (large) trained on DA from WMT15 to WMT20. This model was presented at the WMT21 Metrics shared task. |
wmt21-cometinho-mqm | Regression model built on top of XLM-R (large) trained to maximize correlation with MQM annotations from Freitag et al., 2020. |
Train your own Metric:
Instead of using pretrained models, you can train your own model with the following command:
comet-train -cfg configs/models/{your_model_config}.yaml
Tensorboard:
Launch tensorboard with:
tensorboard --logdir="lightning_logs/"
unittest:
To run the toolkit tests, run the following commands:
coverage run --source=comet -m unittest discover
coverage report -m
Publications