Evalica, your favourite evaluation toolkit.

Evalica

Evalica [ɛˈʋalit͡sa] (eh-vah-lee-tsah) is an evaluation toolkit for statistical analysis, combining fast Rust implementations with Python APIs for ranking, reliability, and uncertainty estimation. Evalica is fully compatible with NumPy arrays and pandas data frames.

The logo was created using Recraft.

Installation

  • pip: pip install evalica
  • Anaconda: conda install conda-forge::evalica
  • Cargo: cargo add evalica

Pairwise Comparisons

Suppose we want to rank several meals and have the following dataset of three pairwise comparisons produced by food experts.

Item X  Item Y  Winner
pizza   burger  x
burger  sushi   y
pizza   sushi   tie

Given this hypothetical example, Evalica takes these three columns and computes a score for each item according to the chosen model. The first argument is the Item X column, the second argument is the Item Y column, and the third argument is the Winner column.

>>> from evalica import elo, Winner
>>> result = elo(
...     ['pizza', 'burger', 'pizza'],
...     ['burger', 'sushi', 'sushi'],
...     [Winner.X, Winner.Y, Winner.Draw],
... )
>>> result.scores
pizza     1014.972058
burger     970.647200
sushi     1014.380742
Name: elo, dtype: float64

As a result, we obtain the Elo scores of our items. In this example, pizza is the most favoured item, sushi is a close runner-up, and burger is the least preferred item.

Item    Score
pizza   1014.97
burger  970.65
sushi   1014.38
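
To make the scores above easier to interpret, here is a minimal pure-Python sketch of sequential Elo updates. It is not Evalica's implementation: the function name elo_sketch, the string labels 'x', 'y', and 'tie', and the parameter values (initial rating 1000, K-factor 30, scale 400, base 10) are assumptions made for illustration. With these assumed values, the sketch lands on the same scores as the output above, up to rounding.

```python
from collections import defaultdict


def elo_sketch(xs, ys, winners, initial=1000.0, k=30.0, scale=400.0, base=10.0):
    """Sequential Elo updates (illustrative sketch, not Evalica's code).

    winners holds 'x', 'y', or 'tie' for each comparison; these labels and
    all default parameter values are assumptions of this sketch.
    """
    scores = defaultdict(lambda: initial)
    outcome_for_x = {"x": 1.0, "y": 0.0, "tie": 0.5}
    for x, y, winner in zip(xs, ys, winners):
        # Expected result for x, computed from the ratings before the update.
        expected_x = 1.0 / (1.0 + base ** ((scores[y] - scores[x]) / scale))
        # The update is zero-sum: whatever x gains, y loses.
        delta = k * (outcome_for_x[winner] - expected_x)
        scores[x] += delta
        scores[y] -= delta
    return dict(scores)


scores = elo_sketch(
    ["pizza", "burger", "pizza"],
    ["burger", "sushi", "sushi"],
    ["x", "y", "tie"],
)
for item, score in scores.items():
    print(f"{item}: {score:.2f}")
```

Because the updates are applied one comparison at a time, the result depends on the order of the comparisons: pizza and sushi end up almost level only because their tie is processed last.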

Inter-Rater Reliability

Evalica also supports computing Krippendorff's alpha, a statistical measure of inter-rater reliability. Unlike the pairwise comparison methods, alpha accepts a matrix in which rows represent raters (observers) and columns represent units (the items being rated).

>>> import pandas as pd
>>> from evalica import alpha
>>> data = pd.DataFrame([
...     [1, 1, None, 1],
...     [2, 2, 3, 2],
...     [3, 3, 3, 3],
...     [3, 3, 3, 3],
...     [2, 2, 2, 2],
...     [1, 2, 3, 4],
...     [4, 4, 4, 4],
...     [1, 1, 2, 1],
...     [2, 2, 2, 2],
...     [None, 5, 5, 5],
...     [None, None, 1, 1],
... ]).T
>>> result = alpha(data, distance='nominal')
>>> result.alpha
0.7434210526315788
>>> from evalica import alpha_bootstrap
>>> bootstrap_result = alpha_bootstrap(data, distance='nominal', n_resamples=1000, random_state=42)
>>> (bootstrap_result.low, bootstrap_result.high)
(0.4431818181818182, 0.9411764705882353)

This example demonstrates computing alpha and its bootstrap confidence interval with the nominal distance for categorical ratings. Evalica supports multiple distance metrics: nominal, ordinal, interval, ratio, or custom distance functions.
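
For readers who want to see what the nominal case computes, the following is a minimal pure-Python sketch of Krippendorff's alpha with the nominal distance. It is not Evalica's implementation; the function name nominal_alpha is hypothetical, and the input is the table from the example before transposition, so each inner list holds the ratings one unit received and None marks a missing rating.

```python
from collections import Counter


def nominal_alpha(units):
    """Krippendorff's alpha for nominal data (illustrative sketch).

    units is a list of per-unit rating lists; None marks a missing rating.
    Degenerate inputs (for example, all ratings identical) are not handled.
    """
    observed = 0.0
    totals = Counter()
    for ratings in units:
        values = [v for v in ratings if v is not None]
        m = len(values)
        if m < 2:
            continue  # a unit with fewer than two ratings cannot be paired
        counts = Counter(values)
        # Number of ordered pairs of differing ratings within the unit.
        disagreeing = m * m - sum(c * c for c in counts.values())
        observed += disagreeing / (m - 1)
        totals.update(counts)
    n = sum(totals.values())
    # Same pair count over all pairable ratings, pooled across units.
    expected = (n * n - sum(c * c for c in totals.values())) / (n - 1)
    return 1.0 - observed / expected


units = [
    [1, 1, None, 1],
    [2, 2, 3, 2],
    [3, 3, 3, 3],
    [3, 3, 3, 3],
    [2, 2, 2, 2],
    [1, 2, 3, 4],
    [4, 4, 4, 4],
    [1, 1, 2, 1],
    [2, 2, 2, 2],
    [None, 5, 5, 5],
    [None, None, 1, 1],
]
value = nominal_alpha(units)
print(round(value, 4))
```

On this data the sketch agrees with the alpha value reported above to several decimal places. Other distances would replace the simple "differs or not" count with a weighted one.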

Command-Line Interface

Evalica also provides a simple command-line interface, so these methods can be used in shell scripts and for quick prototyping.

Pairwise Ranking

$ evalica -i food.csv pairwise bradley-terry
item,score,rank
Tacos,2.509025136024378,1
Sushi,1.1011561298265815,2
Burger,0.8549063627182466,3
Pasta,0.7403814336665869,4
Pizza,0.5718366915548537,5

Refer to the food.csv file as an input example.
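
As a rough illustration of what a Bradley–Terry fit does, here is a textbook minorization-maximization sketch in pure Python. It is not Evalica's implementation and is not expected to reproduce the scores printed above: the function name bradley_terry_sketch, the 'x'/'y'/'tie' labels, counting a tie as half a win for each side, normalizing the scores to sum to one, and the fixed iteration count are all assumptions.

```python
from collections import defaultdict


def bradley_terry_sketch(xs, ys, winners, iterations=100):
    """Textbook MM iteration for Bradley-Terry (illustrative sketch).

    Assumes at least one comparison; ties count as half a win per side.
    """
    items = sorted(set(xs) | set(ys))
    wins = {item: 0.0 for item in items}
    games = defaultdict(float)  # games[(i, j)]: comparisons between i and j
    for x, y, winner in zip(xs, ys, winners):
        share_x = {"x": 1.0, "y": 0.0, "tie": 0.5}[winner]
        wins[x] += share_x
        wins[y] += 1.0 - share_x
        games[(x, y)] += 1.0
        games[(y, x)] += 1.0
    p = {item: 1.0 / len(items) for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            # Sum over every opponent j that i has been compared with.
            denom = sum(n / (p[i] + p[j]) for (a, j), n in games.items() if a == i)
            updated[i] = wins[i] / denom if denom > 0 else 0.0
        total = sum(updated.values())
        p = {item: v / total for item, v in updated.items()}
    return p


# Toy data: A is preferred to B in three of four comparisons.
p = bradley_terry_sketch(["A"] * 4, ["B"] * 4, ["x", "x", "x", "y"])
print(p)
```

In the toy data the fitted scores are in a 3:1 ratio, matching the observed win ratio, which is the behaviour one expects from the model with only two items.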

Krippendorff's Alpha

For Krippendorff's alpha, use a CSV file with ratings in a matrix format (no header):

$ evalica -i codings.csv alpha --distance=nominal
metric,value
alpha,0.743421052631579
observed,7.999999999999999
expected,31.179487179487182
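
The three values printed above are consistent with the usual form of Krippendorff's alpha, alpha = 1 - observed / expected. Reading the observed and expected rows as disagreement terms is an interpretation of the output, not something the output itself states; the arithmetic can be checked as follows.

```python
# Values copied from the CLI output above.
observed = 7.999999999999999
expected = 31.179487179487182

# Usual definition: one minus the ratio of observed to expected disagreement.
alpha_from_cli = 1.0 - observed / expected
print(alpha_from_cli)  # approximately 0.7434
```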

Web Application

Evalica has a built-in Gradio application that can be launched with python3 -m evalica.gradio. Please ensure that the library is installed with the Gradio extra: pip install evalica[gradio].

Implemented Methods

Method In Python In Rust
Counting
Average Win Rate
Bradley–Terry
Elo
Eigenvalue
PageRank
Newman
Krippendorff's Alpha

Contributing

Evalica is a mixed Rust/Python project that uses PyO3, so it requires setting up the Maturin build system.

To set up the environment, we recommend using the uv package manager, as demonstrated in our test suite:

$ uv venv
$ uv pip install maturin
$ source .venv/bin/activate
$ maturin develop --uv --extras dev,docs,gradio

If uv is not available, you can use the following workaround:

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install maturin
$ maturin develop --extras dev,docs,gradio

It is also possible to omit the Rust-accelerated routines via pip install --no-binary evalica evalica (the --no-binary option takes the package name as its value, so the name appears twice).

We welcome pull requests on GitHub: https://github.com/dustalov/evalica. To contribute, fork the repository, create a separate branch for your changes, and submit a pull request.

Citation

@inproceedings{Ustalov:25,
  author    = {Ustalov, Dmitry},
  title     = {{Reliable, Reproducible, and Really Fast Leaderboards with Evalica}},
  year      = {2025},
  booktitle = {Proceedings of the 31st International Conference on Computational Linguistics: System Demonstrations},
  pages     = {46--53},
  address   = {Abu Dhabi, UAE},
  publisher = {Association for Computational Linguistics},
  eprint    = {2412.11314},
  eprinttype = {arxiv},
  eprintclass = {cs.CL},
  url       = {https://aclanthology.org/2025.coling-demos.6},
  language  = {english},
}

The code for replicating the experiments is available in the coling2025 directory.

Copyright

Copyright (c) 2024–2026 Dmitry Ustalov. See LICENSE for details.
