Tool to guide you through reporting the use of COMET for machine translation evaluation.

SacreCOMET

Since its introduction, the COMET metric has blazed a trail in the machine translation community, given its strong correlation with human judgements of translation quality. Its success stems from being a modified pre-trained multilingual model finetuned for quality assessment. However, being a machine learning model also gives rise to a new set of pitfalls that may not be widely known. We investigate these unexpected behaviours from three aspects: 1) technical: obsolete software versions and compute precision; 2) data: empty content, language mismatch, and translationese at test time as well as distribution and domain biases in training; 3) usage and reporting: multi-reference support and model referencing in the literature. All of these problems imply that COMET scores are not comparable between papers or even between technical setups, and we put forward our perspective on fixing each issue. Furthermore, we release the SacreCOMET package, which can generate a signature for the software and model configuration as well as an appropriate citation. The goal of this work is to help the community make sounder use of the COMET metric.

Read the full paper, "Pitfalls and Outlooks in Using COMET".

Tool

The Python tool has two functionalities. First, it creates a signature with your setup and COMET model:

pip install sacrecomet

# Without any arguments, sacrecomet tries to detect the local environment
# and asks you which COMET model you used.
# Example output: Python3.11.8|Comet2.2.2|fp32|unite-mup

sacrecomet 

# Arguments can also be specified non-interactively:

sacrecomet --model unite-mup --prec fp32
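
If you store signatures next to your scores, it can help to split them back into fields, for example to check that two runs used the same precision. The following is a minimal sketch, not part of SacreCOMET: it assumes the `|`-separated layout seen in the example output above (Python version, COMET version, precision, model), and the names `CometSignature` and `parse_signature` are hypothetical. Any additional fields are kept verbatim, since newer versions may append more information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CometSignature:
    """Fields of a signature, assuming the layout of the example output."""
    python_version: str
    comet_version: str
    precision: str
    model: str
    extra: tuple = ()  # any further fields, kept as-is (assumption)

    def __str__(self) -> str:
        fields = [
            f"Python{self.python_version}",
            f"Comet{self.comet_version}",
            self.precision,
            self.model,
            *self.extra,
        ]
        return "|".join(fields)


def parse_signature(signature: str) -> CometSignature:
    """Split a signature such as 'Python3.11.8|Comet2.2.2|fp32|unite-mup'."""
    parts = signature.strip().split("|")
    if len(parts) < 4:
        raise ValueError(f"expected at least 4 fields, got {len(parts)}")
    python_part, comet_part, precision, model = parts[:4]
    if not python_part.startswith("Python") or not comet_part.startswith("Comet"):
        raise ValueError(f"unexpected signature layout: {signature!r}")
    return CometSignature(
        python_version=python_part[len("Python"):],
        comet_version=comet_part[len("Comet"):],
        precision=precision,
        model=model,
        extra=tuple(parts[4:]),
    )


if __name__ == "__main__":
    sig = parse_signature("Python3.11.8|Comet2.2.2|fp32|unite-mup")
    print(sig.model, sig.precision)
```

Two runs can then be compared field by field, for instance `parse_signature(a).precision == parse_signature(b).precision`.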

The other functionality is to find specific citations for COMET models that you're using:

sacrecomet cite --model Unbabel/xcomet-xl

https://arxiv.org/abs/2310.10482
@misc{guerreiro2023xcomet,
    title={xCOMET: Transparent Machine Translation Evaluation through Fine-grained Error Detection}, 
    ...
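
The citation output above starts with a URL line followed by the BibTeX entry. If you want to paste only the entry into a `.bib` file, a small helper can drop everything before the first line that starts with `@`. This is a sketch under that assumption about the output layout; `extract_bibtex` is a hypothetical name and not part of SacreCOMET.

```python
def extract_bibtex(cite_output: str) -> str:
    """Return the text from the first line starting with '@' onwards.

    Assumes the output layout shown above: optional leading lines
    (such as a URL), then a BibTeX entry.
    """
    lines = cite_output.splitlines()
    for index, line in enumerate(lines):
        if line.lstrip().startswith("@"):
            return "\n".join(lines[index:]).strip()
    raise ValueError("no BibTeX entry found in the given text")


if __name__ == "__main__":
    # Shortened, illustrative input in the same shape as the output above.
    sample = (
        "https://arxiv.org/abs/2310.10482\n"
        "@misc{guerreiro2023xcomet,\n"
        "    title={xCOMET: Transparent Machine Translation Evaluation "
        "through Fine-grained Error Detection},\n"
        "}\n"
    )
    print(extract_bibtex(sample))
```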

You can also list all the available models:

sacrecomet list

unbabel/wmt24-qe-task2-baseline
unbabel/wmt22-cometkiwi-da
unbabel/xcomet-xl
unbabel/xcomet-xxl
unbabel/towerinstruct-13b-v0.1
unbabel/towerinstruct-7b-v0.2
unbabel/towerbase-7b-v0.1
...

Experiments

Documentation TODO

Paper

Cite as:

@inproceedings{zouhar-etal-2024-pitfalls,
    title = "Pitfalls and Outlooks in Using {COMET}",
    author = "Zouhar, Vil{\'e}m and Chen, Pinzhen  and Lam, Tsz Kin  and Moghe, Nikita  and Haddow, Barry",
    editor = "Haddow, Barry and Kocmi, Tom  and Koehn, Philipp  and Monz, Christof",
    booktitle = "Proceedings of the Ninth Conference on Machine Translation",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.wmt-1.121/",
    doi = "10.18653/v1/2024.wmt-1.121",
    pages = "1272--1288",
}

Changelog

  • v1.0.1 (13 January 2025)
    • Stable release
  • v0.1.1 (13 January 2025)
    • Add r in the signature before references.
    • Add simple tests.
  • v0.1.0 (30 October 2024)
    • Add list command to list available models.
    • Add references usage to the SacreCOMET usage.
    • Deprecate the positional model name in sacrecomet cite model_name. Citations now have to specify the model explicitly with the --model argument.
  • v0.0.1 (7 August 2024)
    • First release
