
NMTScore

A library of translation-based text similarity measures.

To learn more about how these measures work, have a look at Jannis' blog post. Also, read our paper, "NMTScore: A Multilingual Analysis of Translation-based Text Similarity Measures" (Findings of EMNLP).

[Figure: The three text similarity measures implemented in this library]

Installation

  • Requires Python >= 3.8 and PyTorch
  • pip install nmtscore
  • Extra requirements for the Prism model: pip install nmtscore[prism]

Usage

NMTScorer

Instantiate a scorer and start scoring short sentence pairs.

from nmtscore import NMTScorer

scorer = NMTScorer()

scorer.score("This is a sentence.", "This is another sentence.")
# 0.4677300455046415

Different similarity measures

The library implements three different measures:

a = "This is a sentence."
b = "This is another sentence."

# Translation cross-likelihood (default)
scorer.score_cross_likelihood(a, b, tgt_lang="en", normalize=True, both_directions=True)

# Direct translation probability
scorer.score_direct(a, b, a_lang="en", b_lang="en", normalize=True, both_directions=True)

# Pivot translation probability
scorer.score_pivot(a, b, a_lang="en", b_lang="en", pivot_lang="en", normalize=True, both_directions=True)

The score() method is a shortcut for score_cross_likelihood().
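For example, the two calls below should return the same value (using a and b as defined above):

scorer.score(a, b)
scorer.score_cross_likelihood(a, b)  # equivalent with default arguments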

Batch processing

The scoring methods also accept lists of strings:

scorer.score(
    ["This is a sentence.", "This is a sentence.", "This is another sentence."],
    ["This is another sentence.", "This sentence is completely unrelated.", "This is another sentence."],
)
# [0.46772973967003206, 0.15306852595255185, 1.0]

The sentences in the first list are compared element-wise to the sentences in the second list.
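If you need every pairwise score instead (e.g., several candidates against several references), you can expand the lists yourself before calling the scorer. A minimal sketch; the expansion below is plain Python and not part of the library:

candidates = ["This is a sentence.", "This is another sentence."]
references = ["This is a sentence.", "This sentence is completely unrelated."]

# Pair every candidate with every reference, element-wise
pairs_a = [c for c in candidates for _ in references]
pairs_b = [r for _ in candidates for r in references]
flat_scores = scorer.score(pairs_a, pairs_b)

# One row of scores per candidate
score_matrix = [
    flat_scores[i * len(references):(i + 1) * len(references)]
    for i in range(len(candidates))
]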

The default batch size is 8. An alternative batch size can be specified as follows (independently for translating and scoring):

scorer.score_direct(
    a, b, a_lang="en", b_lang="en",
    score_kwargs={"batch_size": 16}
)

scorer.score_cross_likelihood(
    a, b,
    translate_kwargs={"batch_size": 16},
    score_kwargs={"batch_size": 16}
)

Different NMT models

This library currently supports four NMT models: small100, m2m100_418M, m2m100_1.2B, and Prism.

By default, the leanest model (small100) is loaded. The main results in the paper are based on the Prism model, which has some extra dependencies (see "Installation" above).

scorer = NMTScorer("small100", device=None)  # default
scorer = NMTScorer("small100", device="cuda:0")  # Enable faster inference on GPU
scorer = NMTScorer("m2m100_418M", device="cuda:0")
scorer = NMTScorer("m2m100_1.2B", device="cuda:0")
scorer = NMTScorer("prism", device="cuda:0")

Which model should I choose?

The page experiments/results/summary.md compares the models in terms of accuracy and latency.

  • Generally, we recommend Prism because it tends to have the highest accuracy. In addition, Prism's implementation currently translates up to 10x faster on GPU than the other models do, so we highly recommend using Prism for the measures that require translation (score_pivot() and score_cross_likelihood()); see the sketch after this list.
  • small100 is 3.4x faster for score_direct() and reaches 94–98% of Prism's accuracy.
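A hypothetical setup following these recommendations (the two scorer variables are illustrative, not part of the library):

# Prism for the translation-based measures, small100 for direct scoring
prism_scorer = NMTScorer("prism", device="cuda:0")
small_scorer = NMTScorer("small100", device="cuda:0")

prism_scorer.score_cross_likelihood(a, b, tgt_lang="en")
small_scorer.score_direct(a, b, a_lang="en", b_lang="en")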

Enable caching of NMT output

It can make sense to cache the translations and scores if they are needed repeatedly, e.g. in reference-based evaluation.

scorer.score_direct(
    a, b, a_lang="en", b_lang="en",
    score_kwargs={"use_cache": True}  # default: False
)

scorer.score_cross_likelihood(
    a, b,
    translate_kwargs={"use_cache": True},  # default: False
    score_kwargs={"use_cache": True}  # default: False
)

Activating this option will create an SQLite database in the ~/.cache directory. The directory can be overridden via the NMTSCORE_CACHE environment variable.
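For example, to redirect the cache to another location (the path below is illustrative; setting the variable before the scorer is created is safest):

import os
os.environ["NMTSCORE_CACHE"] = "/data/nmtscore_cache"  # set before instantiating NMTScorer

from nmtscore import NMTScorer
scorer = NMTScorer()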

Print a version signature (à la SacreBLEU)

scorer.score(a, b, print_signature=True)
# NMTScore-cross|tgt-lang:en|model:alirezamsh/small100|normalized|both-directions|v0.3.0|hf4.26.1

Direct usage of NMT models

The NMT models also provide a direct interface for translating and scoring.

from nmtscore.models import load_translation_model

model = load_translation_model("small100")

model.translate("de", ["This is a test."])
# ["Das ist ein Test."]

model.score("de", ["This is a test."], ["Das ist ein Test."])
# [0.8293135166168213]
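These primitives can be combined freely. As an illustration, here is a simplified, one-direction sketch of a pivot-style similarity between two English sentences (without normalization; this is not the library's internal implementation of score_pivot()):

a = "This is a sentence."
b = "This is another sentence."

# Translate a into a pivot language, then score b against the pivot translation
pivot_translation = model.translate("de", [a])
model.score("en", pivot_translation, [b])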

Experiments

See experiments/README.md

Citation

@inproceedings{vamvas-sennrich-2022-nmtscore,
    title = "{NMTS}core: A Multilingual Analysis of Translation-based Text Similarity Measures",
    author = "Vamvas, Jannis  and
      Sennrich, Rico",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2022",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, United Arab Emirates",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.findings-emnlp.15",
    pages = "198--213"
}

License

  • Code: MIT License
  • Data: See data subdirectories

Changelog

  • v0.3.3

    • Update minimum required Python version to 3.8
    • Require transformers<4.34 to ensure compatibility with the small100 model
    • m2m100/small100: Stop adding extra EOS tokens when scoring, which is no longer needed
  • v0.3.2

    • Fix score calculation with the small100 model (accounting for the fact that, unlike with m2m100, the target sequence is not prefixed with the target language token).
    • Improve caching efficiency
  • v0.3.1

    • Implement the distilled small100 model by Mohammadshahi et al. (2022) and use this model by default.
    • Enable half-precision inference by default for the m2m100 models and small100; see experiments/results/summary.md for benchmark results
  • v0.2.0

    • Bugfix: Provide the source language to m2m100 models (#2). The fix is backwards-compatible, but a warning is now raised if m2m100 is used without specifying the input language.
