fastChrF

Fast computation of sentence-level ChrF, motivated by Minimum Bayes Risk decoding.

  • ChrF (Popović, 2015) is a string similarity metric based on character overlap.
  • Minimum Bayes Risk (MBR) decoding is a strategy for generating text from a language model that requires many pairwise comparisons of strings.

Installation

pip install fastchrf

Usage

Parallelized computation of pairwise ChrF scores

Use the fastchrf.pairwise_chrf function to compute the ChrF score between each hypothesis and each reference in a set of hypotheses and references:

import numpy as np
from fastchrf import pairwise_chrf

hypotheses = ["The cat sat on the mat.", "The cat sat on the hat."]
references = ["The cat sat on the mat.", "The fat cat sat on the mat.", "A cat sat on a mat."]
pairwise_scores = pairwise_chrf([hypotheses], [references])

print(np.array(pairwise_scores))
# [[[100.          74.6319046   55.77074432]
#   [ 79.65373993  57.15287399  50.72182846]]]
  • pairwise_chrf works with a batch dimension, so pass a list of lists of hypotheses and a list of lists of references.
  • For each row in the batch, the function calculates the segment-level ChrF score between each hypothesis and each reference.
  • The output has shape (batch_size, num_hypotheses, num_references).
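
In an MBR setting, the pairwise scores are typically reduced to one expected utility per hypothesis by averaging over the references. The following is a minimal sketch in plain Python that does this on the scores printed above. The helper name `select_by_expected_utility` is illustrative and is not part of fastchrf.

```python
from typing import List, Tuple


def select_by_expected_utility(
    pairwise_scores: List[List[List[float]]],
) -> List[Tuple[int, float]]:
    """For each batch row, return (index of the best hypothesis, its mean score)."""
    selections = []
    for hyp_rows in pairwise_scores:  # one entry per batch row
        # Averaging over the reference axis gives one utility per hypothesis.
        utilities = [sum(row) / len(row) for row in hyp_rows]
        best = max(range(len(utilities)), key=utilities.__getitem__)
        selections.append((best, utilities[best]))
    return selections


# Scores copied from the example output above
# (batch_size=1, 2 hypotheses, 3 references).
pairwise_scores = [[
    [100.0, 74.6319046, 55.77074432],
    [79.65373993, 57.15287399, 50.72182846],
]]

selections = select_by_expected_utility(pairwise_scores)
best_index, best_utility = selections[0]
print(best_index, round(best_utility, 2))
```

The nested list is indexed as `pairwise_scores[batch][hypothesis][reference]`, so the reduction runs over the innermost lists.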

Faster alternative: A streamlined ChrF variant for MBR

fastchrf.pairwise_chrf compares each hypothesis to each reference. This is slow when the number of hypotheses and references is large, as is the case in MBR decoding. fastchrf.aggregate_chrf computes a streamlined variant of ChrF that is faster to compute:

import numpy as np
from fastchrf import aggregate_chrf

hypotheses = ["The cat sat on the mat.", "The cat sat on the hat."]
references = ["The cat sat on the mat.", "The fat cat sat on the mat.", "A cat sat on a mat."]
aggregate_scores = aggregate_chrf([hypotheses], [references])

print(np.array(aggregate_scores))
# [[78.56389618 63.3719368 ]]
  • aggregate_chrf does not output individual scores for each reference. Instead, it outputs an aggregate score across references.
  • The output has shape (batch_size, num_hypotheses).
  • The aggregate score is not equal to the average of the individual scores, nor is it equal to standard multi-reference ChrF. See below for a formal description.
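
The difference between the aggregate score and the average of the individual scores can be checked directly on the two example outputs above. The snippet below is only an arithmetic check on those printed numbers and does not call fastchrf.

```python
# Outputs copied from the two usage examples above.
pairwise_scores = [[
    [100.0, 74.6319046, 55.77074432],
    [79.65373993, 57.15287399, 50.72182846],
]]
aggregate_scores = [[78.56389618, 63.3719368]]

# Average the pairwise scores over the references of the first batch row.
mean_pairwise = [sum(row) / len(row) for row in pairwise_scores[0]]

for h, (mean_score, agg_score) in enumerate(zip(mean_pairwise, aggregate_scores[0])):
    print(f"hypothesis {h}: mean pairwise = {mean_score:.2f}, aggregate = {agg_score:.2f}")


def ranking(scores):
    """Hypothesis indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


same_ranking = ranking(mean_pairwise) == ranking(aggregate_scores[0])
```

In this small example the two utilities differ numerically but rank the hypotheses in the same order. That agreement is a property of this example and is not guaranteed in general.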

Function Signatures

def pairwise_chrf(hypotheses: List[List[str]], references: List[List[str]], char_order: int=6, beta: float=2.0, remove_whitespace: bool=True, eps_smoothing: bool=False) -> List[List[List[float]]]:
    """
    Returns a matrix of pairwise ChrF scores of shape batch_size x num_hypotheses x num_references
    
    :param hypotheses: A list of lists of hypotheses of shape batch_size x num_hypotheses
    :param references: A list of lists of references of shape batch_size x num_references
    :param char_order: An integer indicating the maximum order of the character n-grams. Defaults to 6.
    :param beta: A float indicating the beta parameter of the F-score. Defaults to 2.0.
    :param remove_whitespace: If `True`, remove whitespace when extracting character n-grams. Defaults to `True`.
    :param eps_smoothing: If `True`, add epsilon smoothing to the ChrF score. Defaults to `False`.
    :return: A list of lists of lists of floats.
    """

def aggregate_chrf(hypotheses: List[List[str]], references: List[List[str]], char_order: int=6, beta: float=2.0, remove_whitespace: bool=True, eps_smoothing: bool=False) -> List[List[float]]:
    """
    Returns a matrix of fastChrF scores of shape batch_size x num_hypotheses

    :param hypotheses: A list of lists of hypotheses of shape batch_size x num_hypotheses
    :param references: A list of lists of references of shape batch_size x num_references
    :param char_order: An integer indicating the maximum order of the character n-grams. Defaults to 6.
    :param beta: A float indicating the beta parameter of the F-score. Defaults to 2.0.
    :param remove_whitespace: If `True`, remove whitespace when extracting character n-grams. Defaults to `True`.
    :param eps_smoothing: If `True`, add epsilon smoothing to the ChrF score. Defaults to `False`.
    :return: A list of lists of floats.
    """

Formal Description

Sentence-level ChrF (Popović, 2015) compares two strings by counting the number of character n-grams that they have in common.

Given a hypothesis $\textrm{hyp}$ and a reference $\textrm{ref}$, ChrF internally represents them as bags of character n-grams. Think of a Python Counter object that maps each n-gram to its count in the string.

Three operations on bags of n-grams are relevant for ChrF:

  1. Cardinality: The number of n-grams in the bag. This is denoted by $|\textrm{hyp}|$ and $|\textrm{ref}|$, respectively.
  2. Intersection: Creating a bag that for each n-gram contains the smaller of the two counts in the hypothesis and the reference. We denote this by $\textrm{hyp} \cap \textrm{ref}$.
  3. Sum: Creating a bag that for each n-gram contains the sum of the counts in the hypothesis and the reference. We denote this by $\textrm{hyp} \uplus \textrm{ref}$.
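
The three operations map directly onto Python's `collections.Counter`. The sketch below uses character bigrams only and a hypothetical helper `char_ngrams` for illustration.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Bag of character n-grams of a single order n, with whitespace removed."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


hyp = char_ngrams("the cat", 2)  # "thecat" -> th, he, ec, ca, at
ref = char_ngrams("the hat", 2)  # "thehat" -> th, he, eh, ha, at

cardinality_hyp = sum(hyp.values())  # |hyp|: total number of n-grams in the bag
intersection = hyp & ref             # per n-gram, the smaller of the two counts
bag_sum = hyp + ref                  # per n-gram, the sum of the two counts

print(cardinality_hyp)                 # 5
print(sorted(intersection.items()))    # [('at', 1), ('he', 1), ('th', 1)]
print(sum(bag_sum.values()))           # 10
```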

The standard ChrF score is an F-score that combines precision and recall of character n-grams:

\textrm{ChrF} = \frac{(1 + \beta^2) \cdot \textrm{ChrP} \cdot \textrm{ChrR}}{\beta^2 \cdot \textrm{ChrP} + \textrm{ChrR}},

where

\textrm{ChrP} = \frac{|\textrm{hyp} \cap \textrm{ref}|}{|\textrm{hyp}|}

and

\textrm{ChrR} = \frac{|\textrm{hyp} \cap \textrm{ref}|}{|\textrm{ref}|}.

(The parameter $\beta$ controls the relative importance of precision and recall.)
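
As a rough illustration, the formulas above can be written in a few lines of Python. This is a sketch of the simplified single-bag description only: pooling all n-gram orders from 1 to `max_order` into one bag and scaling the result to 0–100 are assumptions made here, and the names `char_ngrams` and `chrf` are illustrative. Established implementations may treat the n-gram orders separately and apply smoothing, so the numbers need not match the outputs shown earlier.

```python
from collections import Counter


def char_ngrams(text: str, max_order: int = 6) -> Counter:
    """One bag with all character n-grams of orders 1..max_order (whitespace removed)."""
    chars = "".join(text.split())
    return Counter(
        chars[i:i + n]
        for n in range(1, max_order + 1)
        for i in range(len(chars) - n + 1)
    )


def chrf(hyp: str, ref: str, beta: float = 2.0, max_order: int = 6) -> float:
    """F-score of n-gram precision and recall, scaled to 0..100."""
    hyp_bag = char_ngrams(hyp, max_order)
    ref_bag = char_ngrams(ref, max_order)
    overlap = sum((hyp_bag & ref_bag).values())  # |hyp ∩ ref|
    if overlap == 0:
        return 0.0  # also covers empty inputs and avoids division by zero
    precision = overlap / sum(hyp_bag.values())  # ChrP
    recall = overlap / sum(ref_bag.values())     # ChrR
    return 100 * (1 + beta**2) * precision * recall / (beta**2 * precision + recall)


print(chrf("abc", "abc"))                          # identical strings
print(chrf("abc", "xyz"))                          # no shared n-grams
print(round(chrf("ab", "abab", max_order=1), 2))   # precision 1.0, recall 0.5
print(round(chrf("abab", "ab", max_order=1), 2))   # precision 0.5, recall 1.0
```

With $\beta = 2$, the last two calls show that a hypothesis with full recall and half precision scores higher than one with full precision and half recall.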

fastchrf.pairwise_chrf

Calculating pairwise ChrF scores is relevant for sampling-based MBR (Eikema & Aziz, 2022), where many samples and references are generated and then the sample with the highest expected utility is selected.

If ChrF is used as the utility metric for MBR, the expected utility of $\textrm{hyp}$ is calculated as the average ChrF score between $\textrm{hyp}$ and the set of references $R$:

\textrm{utility}_{\textrm{ChrF}}(\textrm{hyp}) = \frac{1}{|R|} \sum_{\textrm{ref} \in R} \textrm{ChrF}(\textrm{hyp}, \textrm{ref}).

Unfortunately, the number of intersections $\textrm{hyp} \cap \textrm{ref}$ that need to be calculated grows with the product of the number of hypotheses and the number of references, that is, quadratically when both sets have the same size.

fastchrf.aggregate_chrf

The idea behind fastchrf.aggregate_chrf is to first create an "average" reference $\overline{\textrm{ref}}$ by averaging the bags of n-grams in $R$:

\overline{\textrm{ref}} = \frac{1}{|R|} \biguplus_{\textrm{ref} \in R} \textrm{ref}.

The utility is then calculated as the ChrF score between $\textrm{hyp}$ and $\overline{\textrm{ref}}$:

\textrm{utility}_{\textrm{fastChrF}}(\textrm{hyp}) = \textrm{ChrF}(\textrm{hyp}, \overline{\textrm{ref}}).

Because $\overline{\textrm{ref}}$ is the same for every $\textrm{hyp}$, the number of bag-of-ngram operations that need to be performed is now linear in the number of hypotheses. However, note that this formulation clearly differs from textbook ChrF. The functions $\textrm{utility}_{\textrm{ChrF}}$ and $\textrm{utility}_{\textrm{fastChrF}}$ are not equivalent.
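
The following sketch illustrates the averaged-reference idea in plain Python and contrasts it with the mean of pairwise scores. It is not the fastchrf implementation: pooling all n-gram orders into one bag, the 0–100 scaling, and all function names are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List


def char_ngrams(text: str, max_order: int = 6) -> Counter:
    """One bag with all character n-grams of orders 1..max_order (whitespace removed)."""
    chars = "".join(text.split())
    return Counter(
        chars[i:i + n]
        for n in range(1, max_order + 1)
        for i in range(len(chars) - n + 1)
    )


def average_bag(bags: List[Counter]) -> Dict[str, float]:
    """Sum the bags, then divide every count by the number of bags."""
    total = Counter()
    for bag in bags:
        total.update(bag)  # Counter.update adds counts
    return {ngram: count / len(bags) for ngram, count in total.items()}


def utility(hyp_bag: Counter, ref_bag: Dict[str, float], beta: float = 2.0) -> float:
    """ChrF-style F-score between a hypothesis bag and a (possibly fractional) reference bag."""
    overlap = sum(min(count, ref_bag.get(ngram, 0.0)) for ngram, count in hyp_bag.items())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_bag.values())
    recall = overlap / sum(ref_bag.values())
    return 100 * (1 + beta**2) * precision * recall / (beta**2 * precision + recall)


hypotheses = ["ab", "abcd"]
references = ["ab", "abcd"]
ref_bags = [char_ngrams(ref) for ref in references]

# Aggregate variant: build the averaged reference once, then one intersection per hypothesis.
avg_ref = average_bag(ref_bags)
aggregate = [utility(char_ngrams(hyp), avg_ref) for hyp in hypotheses]

# Pairwise variant: one intersection per (hypothesis, reference) pair, then average.
mean_pairwise = [
    sum(utility(char_ngrams(hyp), dict(ref_bag)) for ref_bag in ref_bags) / len(ref_bags)
    for hyp in hypotheses
]

print([round(score, 2) for score in aggregate])
print([round(score, 2) for score in mean_pairwise])
```

With a single reference, the averaged bag is that reference's own bag, so the two utilities coincide. They diverge once the references differ from one another.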

Benchmarking

  • Up to 1024 medium-size hypotheses/references in German
  • Batch size 1
  • 64-core CPU
n     SacreBLEU (ms)  fastchrf.pairwise_chrf (ms)  fastchrf.aggregate_chrf (ms)
1               0.49                         0.27                          0.34
2               1.77                         0.51                          0.72
4               6.56                         1.28                          1.04
8              23.28                         2.88                          2.10
16             95.18                         8.92                          3.78
32            382.58                        30.33                          6.60
64           1497.29                       106.99                         11.39
128          6062.98                       409.86                         20.44
256         24072.80                      1691.64                         40.17
512         96216.99                      7465.06                         75.94
1024       383965.22                     32262.39                        144.78
[Figure: line graph visualizing the results in the table]

Caution: fastChrF is not intended to be used as an evaluation metric. For evaluating NLG systems with the ChrF metric, use the implementation provided by sacreBLEU instead.
