Evalica, your favourite evaluation toolkit.

Evalica

Evalica [ɛˈʋalit͡sa] (eh-vah-lee-tsah) is an evaluation toolkit for statistical analysis, combining fast Rust implementations with Python APIs for ranking, reliability, and uncertainty estimation. Evalica is fully compatible with NumPy arrays and pandas data frames.

The logo was created using Recraft.

Installation

  • pip: pip install evalica
  • Anaconda: conda install conda-forge::evalica
  • Cargo: cargo add evalica

Pairwise Comparisons

Suppose we want to rank several meals and have the following dataset of three pairwise comparisons produced by food experts.

Item X    Item Y    Winner
pizza     burger    x
burger    sushi     y
pizza     sushi     tie

Given this hypothetical example, Evalica takes these three columns and computes a score for each item from the pairwise comparisons according to the chosen model. The first argument is the Item X column, the second is the Item Y column, and the third is the Winner column.

>>> from evalica import elo, Winner
>>> result = elo(
...     ['pizza', 'burger', 'pizza'],
...     ['burger', 'sushi', 'sushi'],
...     [Winner.X, Winner.Y, Winner.Draw],
... )
>>> result.scores
pizza     1014.972058
burger     970.647200
sushi     1014.380742
Name: elo, dtype: float64

As a result, we obtain the Elo scores of our items. In this example, pizza was the most favoured item, sushi was a close runner-up, and burger was the least preferred item.

Item      Score
pizza     1014.97
burger     970.65
sushi     1014.38
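To make the numbers above easier to follow, here is a minimal pure-Python sketch of sequential Elo updates. It is not Evalica's implementation: the function name elo_sketch and the parameter values (initial rating 1000, K-factor 30, scale 400, base 10) are assumptions chosen for illustration, and the outcomes are plain strings instead of the Winner enum. Under these assumed values the sketch reproduces the scores shown above.

```python
def elo_sketch(xs, ys, winners, initial=1000.0, k=30.0, scale=400.0, base=10.0):
    """Apply Elo updates one comparison at a time.

    winners holds 'x', 'y', or 'tie' for each (x, y) pair.
    All default parameter values are assumptions, not documented defaults.
    """
    scores = {}
    for x, y, winner in zip(xs, ys, winners):
        rating_x = scores.setdefault(x, initial)
        rating_y = scores.setdefault(y, initial)
        # Expected score of x against y under the logistic Elo model.
        expected_x = 1.0 / (1.0 + base ** ((rating_y - rating_x) / scale))
        # Actual score of x: 1 for a win, 0 for a loss, 0.5 for a tie.
        actual_x = {"x": 1.0, "y": 0.0, "tie": 0.5}[winner]
        delta = k * (actual_x - expected_x)
        # The update is zero-sum: what x gains, y loses.
        scores[x] = rating_x + delta
        scores[y] = rating_y - delta
    return scores


scores = elo_sketch(
    ["pizza", "burger", "pizza"],
    ["burger", "sushi", "sushi"],
    ["x", "y", "tie"],
)
for item, score in scores.items():
    print(f"{item:<8}{score:.2f}")
```

Because the updates are applied in order, shuffling the comparisons can change the final Elo scores, which is worth keeping in mind when comparing Elo with the order-independent models.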

Inter-Rater Reliability

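Evalica also supports computing Krippendorff's alpha, a statistical measure of inter-rater reliability. Unlike the pairwise comparison methods, alpha accepts a matrix in which rows represent raters (observers) and columns represent units (items being rated). The example below lists the ratings unit by unit and then transposes the data frame with .T, so that its four rows correspond to the raters and its eleven columns to the units.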

>>> import pandas as pd
>>> from evalica import alpha
>>> data = pd.DataFrame([
...     [1, 1, None, 1],
...     [2, 2, 3, 2],
...     [3, 3, 3, 3],
...     [3, 3, 3, 3],
...     [2, 2, 2, 2],
...     [1, 2, 3, 4],
...     [4, 4, 4, 4],
...     [1, 1, 2, 1],
...     [2, 2, 2, 2],
...     [None, 5, 5, 5],
...     [None, None, 1, 1],
... ]).T
>>> result = alpha(data, distance='nominal')
>>> result.alpha
0.7434210526315788

This example demonstrates computing alpha with nominal distance for categorical ratings. The result indicates substantial agreement among raters (alpha ≈ 0.74). Evalica supports multiple distance metrics: nominal, ordinal, interval, ratio, or custom distance functions.
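For readers who want to see where the value comes from, the following is a minimal pure-Python sketch of Krippendorff's alpha with nominal distance, computed through the usual coincidence matrix. It is an illustration, not Evalica's implementation: the function name nominal_alpha is hypothetical, and the input is the list of units before transposition (one inner list per unit, None for a missing rating).

```python
from collections import Counter
from itertools import permutations


def nominal_alpha(units):
    """Return (alpha, observed, expected) for nominal ratings.

    Degenerate input (for example, every rating identical) is not handled
    and would raise ZeroDivisionError.
    """
    coincidences = Counter()
    for ratings in units:
        values = [v for v in ratings if v is not None]
        m = len(values)
        if m < 2:
            continue  # a unit with fewer than two ratings cannot be paired
        # Every ordered pair of ratings from different raters counts 1/(m-1).
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)
    # Marginal totals: how often each value occurs among pairable ratings.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    # Nominal distance is 1 for different values and 0 for equal ones.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    return 1.0 - observed / expected, observed, expected


units = [
    [1, 1, None, 1],
    [2, 2, 3, 2],
    [3, 3, 3, 3],
    [3, 3, 3, 3],
    [2, 2, 2, 2],
    [1, 2, 3, 4],
    [4, 4, 4, 4],
    [1, 1, 2, 1],
    [2, 2, 2, 2],
    [None, 5, 5, 5],
    [None, None, 1, 1],
]
value, observed, expected = nominal_alpha(units)
print(round(value, 6))
```

On this data, only three units contain any disagreement, which is why the observed disagreement is small relative to the disagreement expected by chance.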

Command-Line Interface

Evalica also provides a simple command-line interface, allowing the use of these methods in shell scripts and for prototyping.

Pairwise Ranking

$ evalica -i food.csv pairwise bradley-terry
item,score,rank
Tacos,2.509025136024378,1
Sushi,1.1011561298265815,2
Burger,0.8549063627182466,3
Pasta,0.7403814336665869,4
Pizza,0.5718366915548537,5

Refer to the food.csv file as an input example.

Krippendorff's Alpha

For Krippendorff's alpha, use a CSV file with ratings in a matrix format (no header):

$ evalica -i codings.csv alpha --distance=nominal
metric,value
alpha,0.743421052631579
observed,7.999999999999999
expected,31.179487179487182
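The three reported values are related by the standard definition of Krippendorff's alpha, alpha = 1 - observed / expected. The short check below parses the sample output above from a literal string and recomputes alpha. Reading the observed and expected rows as observed and expected disagreement is an assumption, although the numbers are consistent with it.

```python
import csv
import io

# Sample CLI output copied verbatim from the example above.
cli_output = """metric,value
alpha,0.743421052631579
observed,7.999999999999999
expected,31.179487179487182
"""

metrics = {
    row["metric"]: float(row["value"])
    for row in csv.DictReader(io.StringIO(cli_output))
}
# Standard definition: alpha = 1 - D_observed / D_expected.
recomputed = 1.0 - metrics["observed"] / metrics["expected"]
print(abs(recomputed - metrics["alpha"]) < 1e-9)
```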

Web Application

Evalica has a built-in Gradio application that can be launched as python3 -m evalica.gradio. Make sure the library is installed with the gradio extra: pip install evalica[gradio].

Implemented Methods

Method In Python In Rust
Counting
Average Win Rate
Bradley–Terry
Elo
Eigenvalue
PageRank
Newman
Krippendorff's Alpha

Contributing

Evalica is a mixed Rust/Python project that uses PyO3, so it requires setting up the Maturin build system.

To set up the environment, we recommend using the uv package manager, as demonstrated in our test suite:

$ uv venv
$ uv pip install maturin
$ source .venv/bin/activate
$ maturin develop --uv --extras dev,docs,gradio

If uv is not available, you can use the following workaround:

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install maturin
$ maturin develop --extras dev,docs,gradio

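It is also possible to omit the Rust-accelerated routines via pip install --no-binary evalica evalica. The package name appears twice because pip's --no-binary option takes the package name as its value, and the requirement to install must still be given separately.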

We welcome pull requests on GitHub: https://github.com/dustalov/evalica. To contribute, fork the repository, create a separate branch for your changes, and submit a pull request.

Citation

@inproceedings{Ustalov:25,
  author    = {Ustalov, Dmitry},
  title     = {{Reliable, Reproducible, and Really Fast Leaderboards with Evalica}},
  year      = {2025},
  booktitle = {Proceedings of the 31st International Conference on Computational Linguistics: System Demonstrations},
  pages     = {46--53},
  address   = {Abu Dhabi, UAE},
  publisher = {Association for Computational Linguistics},
  eprint    = {2412.11314},
  eprinttype = {arxiv},
  eprintclass = {cs.CL},
  url       = {https://aclanthology.org/2025.coling-demos.6},
  language  = {english},
}

The code for replicating the experiments is available in the coling2025 directory.

Copyright

Copyright (c) 2024–2026 Dmitry Ustalov. See LICENSE for details.
