
A visual platform for contrastive evaluation of machine translation systems

Project description




MT-Telescope is a toolkit for comparative analysis of MT systems that provides a number of tools to add rigor and depth to MT evaluation. With this package we endeavour to make it easier for researchers and industry practitioners to compare MT systems by giving them easy access to:

  1. SOTA MT evaluation metrics such as COMET (Rei et al., 2020).
  2. Statistical tests such as bootstrap resampling (Koehn, 2004).
  3. Dynamic filters to select parts of your test set with specific phenomena.
  4. A visual interface with plots to compare systems side-by-side, segment-by-segment.

We highly recommend reading the following papers to learn more about how to perform better MT evaluation:

Install:

Via pip:

pip install mt-telescope==0.0.1rc1

Note: This is currently a pre-release.
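
If you want a quick sanity check that the pre-release installed correctly, pip itself can report the installed version (this is a generic pip command, not part of the toolkit):

pip show mt-telescope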

Locally:

Create a virtual environment and make sure you have Poetry installed.

Then run:

git clone https://github.com/Unbabel/MT-Telescope
cd MT-Telescope
poetry install
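
As a minimal check, assuming poetry install registered the telescope entry point inside the virtual environment, you can then invoke the CLI through Poetry:

poetry run telescope --help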

Scoring:

To get system-level scores for a particular MT system, simply run telescope score:

telescope score -s {path/to/sources} -h {path/to/translations} -r {path/to/references} -l {target_language} -m COMET -m chrF
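
As a concrete sketch, the call below scores one system with COMET and chrF; the file paths and the en language code are placeholders for your own data, not files shipped with the package:

telescope score \
  -s testset/src.txt \
  -h testset/system-output.txt \
  -r testset/ref.txt \
  -l en \
  -m COMET -m chrF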

Comparing two systems:

For a comparison between two systems, you can run telescope using either:

  1. The command line interface
  2. A web browser

Command Line Interface (CLI):

To run system comparisons from the CLI, use the telescope compare command.

Usage: telescope compare [OPTIONS]

Options:
  -s, --source FILENAME           Source segments.  [required]
  -x, --system_x FILENAME         System X MT outputs.  [required]
  -y, --system_y FILENAME         System Y MT outputs.  [required]
  -r, --reference FILENAME        Reference segments.  [required]
  -l, --language TEXT             Language of the evaluated text.  [required]
  -m, --metric [COMET|sacreBLEU|chrF|ZeroEdit|BLEURT|BERTScore|TER|Prism|GLEU]
                                  MT metric to run.  [required]
  -f, --filter [named-entities|duplicates]
                                  Filter to apply to the test set.
  --seg_metric [COMET|ZeroEdit|BLEURT|BERTScore|Prism|GLEU]
                                  Segment-level metric to use for segment-
                                  level analysis.

  -o, --output_folder TEXT        Folder you wish to use to save plots.
  --bootstrap                     Perform bootstrap resampling.
  --num_splits INTEGER            Number of random partitions used in
                                  Bootstrap resampling.

  --sample_ratio FLOAT            Proportion of the test set sampled in each
                                  bootstrap partition.
  --help                          Show this message and exit.

Example 1: Running several metrics

Running sacreBLEU, chrF, BERTScore, and COMET to compare two systems:

telescope compare \
  -s path/to/src/file.txt \
  -x path/to/system-x/file.txt \
  -y path/to/system-y/file.txt \
  -r path/to/ref/file.txt \
  -l en \
  -m sacreBLEU -m chrF -m BERTScore -m COMET

Example 2: Saving a comparison report

telescope compare \
  -s path/to/src/file.txt \
  -x path/to/system-x/file.txt \
  -y path/to/system-y/file.txt \
  -r path/to/ref/file.txt \
  -l en \
  -m COMET \
  --output_folder FOLDER-PATH
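
Example 3: Filters and bootstrap resampling

The remaining options from the help text above combine in the same way. The sketch below is an assumed invocation with placeholder paths that restricts the comparison to named-entity segments and adds bootstrap resampling; the 300 partitions and 0.5 sample ratio are illustrative values, not recommendations:

telescope compare \
  -s path/to/src/file.txt \
  -x path/to/system-x/file.txt \
  -y path/to/system-y/file.txt \
  -r path/to/ref/file.txt \
  -l en \
  -m COMET \
  -f named-entities \
  --bootstrap \
  --num_splits 300 \
  --sample_ratio 0.5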

Web Interface:

To run the web interface, simply run:

streamlit run app.py
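
app.py lives at the root of the MT-Telescope repository, so this assumes you are inside the cloned checkout (see the local install above). If the default port is taken, Streamlit's standard --server.port option can be passed as usual; the port below is only an example:

streamlit run app.py --server.port 8501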

Cite:

@inproceedings{rei-etal-2021-mt,
    author    = {Rei, Ricardo and Stewart, Craig and Farinha, Ana C and Lavie, Alon},
    title     = "{MT}-{T}elescope: {A}n interactive platform for contrastive evaluation of {MT} systems",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Demonstrations)",
    month     = aug,
    year      = "2021",
    address   = "Online",
    publisher = "Association for Computational Linguistics",
}

Download files

Download the file for your platform.

Source Distribution

mt-telescope-0.0.1rc1.tar.gz (27.2 kB)

Uploaded Source

Built Distribution

mt_telescope-0.0.1rc1-py3-none-any.whl (49.5 kB)

Uploaded Python 3

File details

Details for the file mt-telescope-0.0.1rc1.tar.gz.

File metadata

  • Download URL: mt-telescope-0.0.1rc1.tar.gz
  • Upload date:
  • Size: 27.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.6.9

File hashes

Hashes for mt-telescope-0.0.1rc1.tar.gz:

  • SHA256: d76726cfbdd4ea908cfc0a82cdd8479717d183530e53b0561f3f24f10a598a05
  • MD5: ad4bf074319efabd76ebb2ff5bb5d2bd
  • BLAKE2b-256: f4b1460d58ac92a1f4116f3f8f9327b033de33c935406a0f9a0310e9dc4aa5c2

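To verify a downloaded archive against the digests above, a standard checksum tool is enough. For example, on Linux (use shasum -a 256 on macOS), the output of the command below should match the SHA256 value listed for the source distribution:

sha256sum mt-telescope-0.0.1rc1.tar.gz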

File details

Details for the file mt_telescope-0.0.1rc1-py3-none-any.whl.

File metadata

  • Download URL: mt_telescope-0.0.1rc1-py3-none-any.whl
  • Upload date:
  • Size: 49.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.6.9

File hashes

Hashes for mt_telescope-0.0.1rc1-py3-none-any.whl:

  • SHA256: 083f257779b8ceb8cdb34f3b912c5b170cb10bc2270de02d56c6a7edeb8ea7dc
  • MD5: d62d63fee1581272c8e1a8ba4a9dd058
  • BLAKE2b-256: dbf8892d16caa2d04dd867381030af7fb75c465dbaa58fe14858f2d2c0a9fcf7

