A fast BLEU score calculator

Project description

bleuscore

codecov MIT licensed Crates.io PyPI - Version docs.rs

bleuscore is a fast BLEU score calculator written in Rust.

Installation

The Python package is published on PyPI, so you can install it directly with any of the following tools:

  • pip

    pip install bleuscore
    
  • poetry

    poetry add bleuscore
    
  • uv

    uv pip install bleuscore
    

Quick Start

The usage is exactly the same as Hugging Face evaluate:

- import evaluate
+ import bleuscore

predictions = ["hello there general kenobi", "foo bar foobar"]
references = [
    ["hello there general kenobi", "hello there !"],
    ["foo bar foobar"]
]

- bleu = evaluate.load("bleu")
- results = bleu.compute(predictions=predictions, references=references)
+ results = bleuscore.compute(predictions=predictions, references=references)

print(results)
# {'bleu': 1.0, 'precisions': [1.0, 1.0, 1.0, 1.0], 'brevity_penalty': 1.0, 
# 'length_ratio': 1.1666666666666667, 'translation_length': 7, 'reference_length': 6}
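
To make the fields of the result dictionary easier to read, here is a minimal pure-Python sketch of corpus-level BLEU. It is not bleuscore's implementation: it assumes plain whitespace tokenization and no smoothing, and the names `ngram_counts` and `bleu_sketch` are hypothetical helpers introduced only for illustration.

```python
import math
from collections import Counter


def ngram_counts(tokens, max_order=4):
    """Count every n-gram of length 1..max_order in a token list."""
    counts = Counter()
    for order in range(1, max_order + 1):
        for i in range(len(tokens) - order + 1):
            counts[tuple(tokens[i:i + order])] += 1
    return counts


def bleu_sketch(predictions, references, max_order=4):
    """Corpus-level BLEU with whitespace tokenization (an assumption)."""
    matches = [0] * max_order   # clipped n-gram matches, per order
    possible = [0] * max_order  # n-grams present in the predictions, per order
    translation_length = 0
    reference_length = 0

    for pred, refs in zip(predictions, references):
        pred_tokens = pred.split()
        ref_tokens = [r.split() for r in refs]
        translation_length += len(pred_tokens)
        # Use the shortest reference of each segment for the length statistics.
        reference_length += min(len(r) for r in ref_tokens)

        # Clip each n-gram count by its maximum count in any single reference.
        max_ref_counts = Counter()
        for r in ref_tokens:
            max_ref_counts |= ngram_counts(r, max_order)
        overlap = ngram_counts(pred_tokens, max_order) & max_ref_counts

        for ngram, count in overlap.items():
            matches[len(ngram) - 1] += count
        for order in range(1, max_order + 1):
            possible[order - 1] += max(len(pred_tokens) - order + 1, 0)

    precisions = [m / p if p > 0 else 0.0 for m, p in zip(matches, possible)]
    if min(precisions) > 0:
        geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_order)
    else:
        geo_mean = 0.0

    # Brevity penalty: penalize predictions shorter than the references.
    if translation_length == 0 or reference_length == 0:
        length_ratio, brevity_penalty = 0.0, 0.0
    else:
        length_ratio = translation_length / reference_length
        brevity_penalty = 1.0 if length_ratio > 1.0 else math.exp(1 - 1 / length_ratio)

    return {
        "bleu": geo_mean * brevity_penalty,
        "precisions": precisions,
        "brevity_penalty": brevity_penalty,
        "length_ratio": length_ratio,
        "translation_length": translation_length,
        "reference_length": reference_length,
    }
```

On the demo data above, the predictions contain 7 tokens and the shortest references contain 3 + 3 = 6 tokens, which is where `translation_length`, `reference_length`, and the `length_ratio` of 7/6 come from. Results on other inputs may differ from bleuscore wherever its tokenizer splits text differently from `str.split`.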

Benchmark

TLDR: We measured more than a 10x speedup once the corpus size reaches 100K or more.

We use the demo data shown in Quick Start for this simple benchmark. The benchmark source code is in benchmark/simple.

  • rs_bleuscore: the bleuscore Python library
  • local_hf_bleu: the Hugging Face evaluate BLEU algorithm, run locally
  • sacre_bleu: sacrebleu
    • Note that sacrebleu gives a different result on the simple demo data; all the others give the same result
  • hf_evaluate: the Hugging Face evaluate BLEU algorithm, run through the evaluate package

N is the factor by which the predictions/references are enlarged, simply by duplicating the demo data shown above. As N increases, bleuscore's performance advantage grows. See the benchmark directory for more details.
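
The duplication step described above can be sketched as follows. This is an illustrative helper, not the code in benchmark/simple: `enlarge` and `time_compute` are hypothetical names, and the timing uses a single `time.perf_counter` measurement rather than hyperfine.

```python
import time

# Demo data from the Quick Start section.
DEMO_PREDICTIONS = ["hello there general kenobi", "foo bar foobar"]
DEMO_REFERENCES = [
    ["hello there general kenobi", "hello there !"],
    ["foo bar foobar"],
]


def enlarge(predictions, references, n):
    """Repeat the corpus n times (hypothetical helper)."""
    return predictions * n, references * n


def time_compute(compute_fn, n):
    """Time one call of compute_fn on the demo data duplicated n times."""
    predictions, references = enlarge(DEMO_PREDICTIONS, DEMO_REFERENCES, n)
    start = time.perf_counter()
    results = compute_fn(predictions=predictions, references=references)
    elapsed = time.perf_counter() - start
    return results, elapsed
```

Passing `bleuscore.compute` as `compute_fn`, for example `time_compute(bleuscore.compute, 100_000)`, would time the library on 200,000 prediction/reference pairs. A single in-process measurement like this excludes interpreter startup, so it is not directly comparable to the hyperfine numbers below.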

N=100

hyperfine --warmup 5 --runs 10   \
  "python simple/rs_bleuscore.py 100" \
  "python simple/local_hf_bleu.py 100" \
  "python simple/sacre_bleu.py 100"   \
  "python simple/hf_evaluate.py 100"

Benchmark 1: python simple/rs_bleuscore.py 100
  Time (mean ± σ):      19.0 ms ±   2.6 ms    [User: 17.8 ms, System: 5.3 ms]
  Range (min … max):    14.8 ms …  23.2 ms    10 runs

Benchmark 2: python simple/local_hf_bleu.py 100
  Time (mean ± σ):      21.5 ms ±   2.2 ms    [User: 19.0 ms, System: 2.5 ms]
  Range (min … max):    16.8 ms …  24.1 ms    10 runs

Benchmark 3: python simple/sacre_bleu.py 100
  Time (mean ± σ):      45.9 ms ±   2.2 ms    [User: 38.7 ms, System: 7.1 ms]
  Range (min … max):    43.5 ms …  50.9 ms    10 runs

Benchmark 4: python simple/hf_evaluate.py 100
  Time (mean ± σ):      4.504 s ±  0.429 s    [User: 0.762 s, System: 0.823 s]
  Range (min … max):    4.163 s …  5.446 s    10 runs

Summary
  python simple/rs_bleuscore.py 100 ran
    1.13 ± 0.20 times faster than python simple/local_hf_bleu.py 100
    2.42 ± 0.35 times faster than python simple/sacre_bleu.py 100
  237.68 ± 39.88 times faster than python simple/hf_evaluate.py 100

N = 1K ~ 1M

Command                                   Mean [ms]          Min [ms]   Max [ms]   Relative
python simple/rs_bleuscore.py 1000        20.3 ± 1.3         18.2       21.4       1.00
python simple/local_hf_bleu.py 1000       45.8 ± 1.2         44.2       47.5       2.26 ± 0.16
python simple/rs_bleuscore.py 10000       37.8 ± 1.5         35.9       39.5       1.87 ± 0.14
python simple/local_hf_bleu.py 10000      295.0 ± 5.9        288.6      304.2      14.55 ± 0.98
python simple/rs_bleuscore.py 100000      219.6 ± 3.3        215.3      224.0      10.83 ± 0.72
python simple/local_hf_bleu.py 100000     2781.4 ± 42.2      2723.1     2833.0     137.13 ± 9.10
python simple/rs_bleuscore.py 1000000     2048.8 ± 31.4      2013.2     2090.3     101.01 ± 6.71
python simple/local_hf_bleu.py 1000000    28285.3 ± 100.9    28182.1    28396.1    1394.51 ± 90.21
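
The Relative column is measured against the fastest run in the whole table (rs_bleuscore at N=1000), so it does not directly show the speedup at each N. The short sketch below derives the per-N speedup of rs_bleuscore over local_hf_bleu from the mean times reported above; the variable names are illustrative only.

```python
# Mean times in milliseconds, copied from the table above.
mean_ms = {
    1_000: {"rs_bleuscore": 20.3, "local_hf_bleu": 45.8},
    10_000: {"rs_bleuscore": 37.8, "local_hf_bleu": 295.0},
    100_000: {"rs_bleuscore": 219.6, "local_hf_bleu": 2781.4},
    1_000_000: {"rs_bleuscore": 2048.8, "local_hf_bleu": 28285.3},
}

# Speedup at each N = local_hf_bleu mean / rs_bleuscore mean.
speedups = {
    n: times["local_hf_bleu"] / times["rs_bleuscore"]
    for n, times in mean_ms.items()
}

for n, speedup in speedups.items():
    print(f"N={n:>9,}: {speedup:.2f}x")
```

The ratios come out to roughly 2.3x at 1K, 7.8x at 10K, 12.7x at 100K, and 13.8x at 1M, which is the basis for the "more than 10x at 100K and beyond" summary.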

Download files

Download the file for your platform.

Source Distribution

  • bleuscore-0.1.3.tar.gz (1.1 MB): Source

Built Distributions

  • bleuscore-0.1.3-cp38-abi3-win_amd64.whl (734.7 kB): CPython 3.8+, Windows x86-64
  • bleuscore-0.1.3-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB): CPython 3.8+, manylinux: glibc 2.17+ x86-64
  • bleuscore-0.1.3-cp38-abi3-manylinux_2_17_i686.manylinux2014_i686.whl (1.0 MB): CPython 3.8+, manylinux: glibc 2.17+ i686
  • bleuscore-0.1.3-cp38-abi3-macosx_11_0_arm64.whl (846.6 kB): CPython 3.8+, macOS 11.0+ ARM64
  • bleuscore-0.1.3-cp38-abi3-macosx_10_12_x86_64.whl (894.7 kB): CPython 3.8+, macOS 10.12+ x86-64
