High-quality Machine Translation Evaluation
Quick Installation
Detailed usage examples and instructions can be found in the Full Documentation.
Simple installation from PyPI
pip install unbabel-comet
To develop locally, install Poetry and run the following commands:
git clone https://github.com/Unbabel/COMET
cd COMET
poetry install
Scoring MT outputs:
Via Bash:
Examples from WMT20:
echo -e "Dem Feuer konnte Einhalt geboten werden\nSchulen und Kindergärten wurden eröffnet." >> src.de
echo -e "The fire could be stopped\nSchools and kindergartens were open" >> hyp.en
echo -e "They were able to control the fire.\nSchools and kindergartens opened" >> ref.en
comet score -s src.de -h hyp.en -r ref.en
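The command above takes one source, one hypothesis, and one reference file, with one segment per line. Because the `echo ... >>` commands append rather than overwrite, it can be worth confirming that the three files still have the same number of lines before scoring. A minimal sketch of such a check follows; the helper name `check_aligned` is hypothetical and not part of COMET.

```python
from pathlib import Path


def check_aligned(*paths):
    """Return the shared line count, or raise if the files differ in length.

    Hypothetical helper for illustration; it is not part of the COMET package.
    """
    counts = {
        str(p): len(Path(p).read_text(encoding="utf-8").splitlines())
        for p in paths
    }
    if len(set(counts.values())) > 1:
        raise ValueError(f"Files are not line-aligned: {counts}")
    # All counts are equal at this point (or no paths were given).
    return next(iter(counts.values()), 0)
```

For the files created above, the call would be `check_aligned("src.de", "hyp.en", "ref.en")`.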
You can export your results to a JSON file using the --to_json flag and select another model/metric with --model.
comet score -s src.de -h hyp.en -r ref.en --model wmt-large-hter-estimator --to_json segments.json
Via Python:
from comet.models import download_model
model = download_model("wmt-large-da-estimator-1719")
data = [
    {
        "src": "Dem Feuer konnte Einhalt geboten werden",
        "mt": "The fire could be stopped",
        "ref": "They were able to control the fire."
    },
    {
        "src": "Schulen und Kindergärten wurden eröffnet.",
        "mt": "Schools and kindergartens were open",
        "ref": "Schools and kindergartens opened"
    }
]
model.predict(data, cuda=True, show_progress=True)
A simple Pythonic way to convert lists of segments into model inputs:
source = ["Dem Feuer konnte Einhalt geboten werden", "Schulen und Kindergärten wurden eröffnet."]
hypothesis = ["The fire could be stopped", "Schools and kindergartens were open"]
reference = ["They were able to control the fire.", "Schools and kindergartens opened"]
data = {"src": source, "mt": hypothesis, "ref": reference}
data = [dict(zip(data, t)) for t in zip(*data.values())]
model.predict(data, cuda=True, show_progress=True)
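One caveat with the `zip`-based one-liner is that `zip` stops at the shortest list, so a missing segment in any of the three lists is silently dropped. A minimal sketch of a length-checked variant is shown below; the function name `to_samples` is hypothetical and not part of COMET, and the output uses the same `src`/`mt`/`ref` keys as the examples above.

```python
def to_samples(source, hypothesis, reference):
    """Build one {"src", "mt", "ref"} dict per segment.

    Hypothetical helper for illustration; raises instead of silently
    truncating when the three lists have different lengths.
    """
    if not (len(source) == len(hypothesis) == len(reference)):
        raise ValueError(
            "source, hypothesis and reference must have the same length"
        )
    return [
        {"src": s, "mt": m, "ref": r}
        for s, m, r in zip(source, hypothesis, reference)
    ]


samples = to_samples(
    ["Dem Feuer konnte Einhalt geboten werden"],
    ["The fire could be stopped"],
    ["They were able to control the fire."],
)
print(samples[0]["mt"])  # The fire could be stopped
```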
Note: The Python interface returns a list of segment-level scores. You can obtain the corpus-level score by averaging the segment-level scores.
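The averaging step from the note can be sketched as follows. The exact structure returned by `model.predict` is not shown in this document, so the sketch starts from a plain list of floats; the values in `segment_scores` are made up for illustration.

```python
from statistics import mean

# Hypothetical segment-level scores, standing in for the model output.
segment_scores = [0.25, 0.75]

# The corpus-level score is the arithmetic mean of the segment-level scores.
corpus_score = mean(segment_scores)
print(corpus_score)  # 0.5
```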
Model Zoo:
Model | Description |
---|---|
↑wmt-large-da-estimator-1719 | RECOMMENDED: Estimator model built on top of XLM-R (large) trained on DA from WMT17, WMT18 and WMT19. |
↑wmt-base-da-estimator-1719 | Estimator model built on top of XLM-R (base) trained on DA from WMT17, WMT18 and WMT19. |
↓wmt-large-hter-estimator | Estimator model built on top of XLM-R (large) trained to regress on HTER. |
↓wmt-base-hter-estimator | Estimator model built on top of XLM-R (base) trained to regress on HTER. |
↑emnlp-base-da-ranker | Translation ranking model that uses XLM-R to encode sentences. This model was trained with WMT17 and WMT18 Direct Assessments Relative Ranks (DARR). |
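The arrows in the table are not explained here. Assuming they mark whether higher (↑) or lower (↓) scores indicate better translations (HTER is an edit rate, so lower values are better), the direction matters when ranking systems by score. A minimal sketch, with a hypothetical helper `rank_systems` and made-up corpus-level scores:

```python
def rank_systems(system_scores, higher_is_better=True):
    """Return system names ordered from best to worst.

    Hypothetical helper for illustration; not part of the COMET package.
    """
    return sorted(system_scores, key=system_scores.get, reverse=higher_is_better)


# Made-up corpus-level scores for two systems.
da_scores = {"system-a": 0.42, "system-b": 0.57}    # assumed: higher is better
hter_scores = {"system-a": 0.31, "system-b": 0.18}  # assumed: lower is better

print(rank_systems(da_scores, higher_is_better=True))     # ['system-b', 'system-a']
print(rank_systems(hter_scores, higher_is_better=False))  # ['system-b', 'system-a']
```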
QE-as-a-metric:
Model | Description |
---|---|
wmt-large-qe-estimator-1719 | Quality Estimator model built on top of XLM-R (large) trained on DA from WMT17, WMT18 and WMT19. |
Train your own Metric:
Instead of using pretrained models, you can train your own model with the following command:
comet train -f {config_file_path}.yaml
Supported encoders:
- Learning Joint Multilingual Sentence Representations with Neural Machine Translation
- Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- XLM-R: Unsupervised Cross-lingual Representation Learning at Scale
Tensorboard:
Launch tensorboard with:
tensorboard --logdir="experiments/"
Download Command:
To download publicly available corpora for training your new models, you can use the download command. For example, to download the APEQUEST HTER corpus, run the following command:
comet download -d apequest --saving_path data/
unittest:
To run the toolkit tests, run the following commands:
coverage run --source=comet -m unittest discover
coverage report -m
Publications
@inproceedings{rei-etal-2020-comet,
title = "{COMET}: A Neural Framework for {MT} Evaluation",
author = "Rei, Ricardo and
Stewart, Craig and
Farinha, Ana C and
Lavie, Alon",
booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
month = nov,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.emnlp-main.213",
pages = "2685--2702",
}
@inproceedings{rei-EtAl:2020:WMT,
author = {Rei, Ricardo and Stewart, Craig and Farinha, Ana C and Lavie, Alon},
title = {Unbabel's Participation in the WMT20 Metrics Shared Task},
booktitle = {Proceedings of the Fifth Conference on Machine Translation},
month = {November},
year = {2020},
address = {Online},
publisher = {Association for Computational Linguistics},
pages = {909--918},
}
@inproceedings{stewart-etal-2020-comet,
title = "{COMET} - Deploying a New State-of-the-art {MT} Evaluation Metric in Production",
author = "Stewart, Craig and
Rei, Ricardo and
Farinha, Catarina and
Lavie, Alon",
booktitle = "Proceedings of the 14th Conference of the Association for Machine Translation in the Americas (Volume 2: User Track)",
month = oct,
year = "2020",
address = "Virtual",
publisher = "Association for Machine Translation in the Americas",
url = "https://www.aclweb.org/anthology/2020.amta-user.4",
pages = "78--109",
}