Evalica, your favourite evaluation toolkit.

Evalica

Evalica [ɛˈʋalit͡sa] (eh-vah-lee-tsah) is a Python library that transforms pairwise comparisons into ranked lists of items. It offers convenient, high-performance Rust implementations of the corresponding methods via PyO3 and additionally provides naïve Python code for most of them. Evalica is fully compatible with NumPy arrays and pandas data frames.

The logo was created using Recraft.

Installation

  • pip: pip install evalica
  • pip without acceleration: pip install --no-binary evalica evalica
  • Anaconda: conda install conda-forge::evalica

Pairwise Comparisons

Imagine that we would like to rank different meals and have the following dataset of three comparisons produced by food experts.

Item X    Item Y    Winner
pizza     burger    x
burger    sushi     y
pizza     sushi     tie

Given this hypothetical example, Evalica takes these three columns and computes item scores according to the chosen model. Note that the first argument is the Item X column, the second argument is the Item Y column, and the third argument is the Winner column.

>>> from evalica import elo, Winner
>>> result = elo(
...     ['pizza', 'burger', 'pizza'],
...     ['burger', 'sushi', 'sushi'],
...     [Winner.X, Winner.Y, Winner.Draw],
... )
>>> result.scores
pizza     1014.972058
burger     970.647200
sushi     1014.380742
Name: elo, dtype: float64

As a result, we obtain Elo scores of our items. In this example, pizza was the most favoured item, sushi was the runner-up, and burger was the least preferred item.

Item      Score
pizza     1014.97
burger     970.65
sushi     1014.38
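
To make the numbers above less opaque, here is a minimal pure-Python sketch of sequential Elo updates. It is not Evalica's implementation. The function name elo_sketch and the constants (initial rating 1000, K-factor 30, base 10, scale 400) are assumptions chosen for illustration; they happen to reproduce the scores shown above for this dataset.

```python
from collections.abc import Sequence


def elo_sketch(
    xs: Sequence[str],
    ys: Sequence[str],
    outcomes: Sequence[float],
    initial: float = 1000.0,  # assumed starting rating
    k: float = 30.0,          # assumed K-factor
    base: float = 10.0,       # assumed logistic base
    scale: float = 400.0,     # assumed rating scale
) -> dict[str, float]:
    """Sequential Elo; outcome is 1.0 if x wins, 0.0 if y wins, 0.5 for a tie."""
    scores: dict[str, float] = {}
    for x, y, outcome in zip(xs, ys, outcomes):
        rx = scores.setdefault(x, initial)
        ry = scores.setdefault(y, initial)
        # Expected score of x under the logistic Elo model.
        expected_x = 1.0 / (1.0 + base ** ((ry - rx) / scale))
        # Zero-sum update: whatever x gains, y loses.
        delta = k * (outcome - expected_x)
        scores[x] = rx + delta
        scores[y] = ry - delta
    return scores


scores = elo_sketch(
    ["pizza", "burger", "pizza"],
    ["burger", "sushi", "sushi"],
    [1.0, 0.0, 0.5],
)
# Print items from the highest to the lowest score.
for item, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{item}: {score:.2f}")
```

Because the updates are applied one comparison at a time, the result depends on the order of the comparisons; shuffling the three rows would generally give slightly different scores.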

Inter-Rater Reliability

Evalica also supports computing Krippendorff's alpha, a statistical measure of inter-rater reliability. Unlike pairwise comparisons, alpha accepts a matrix where rows represent raters (observers) and columns represent units (items being rated).

>>> import pandas as pd
>>> from evalica import alpha
>>> data = pd.DataFrame([
...     [1, 1, None, 1],
...     [2, 2, 3, 2],
...     [3, 3, 3, 3],
...     [3, 3, 3, 3],
...     [2, 2, 2, 2],
...     [1, 2, 3, 4],
...     [4, 4, 4, 4],
...     [1, 1, 2, 1],
...     [2, 2, 2, 2],
...     [None, 5, 5, 5],
...     [None, None, 1, 1],
... ]).T
>>> result = alpha(data, distance='nominal')
>>> result.alpha
0.7434210526315788

This example demonstrates computing alpha with nominal distance for categorical ratings. The result indicates substantial agreement among raters (alpha ≈ 0.74). Evalica supports multiple distance metrics: nominal, ordinal, interval, ratio, or custom distance functions.
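
For readers who want to see where the value comes from, the following is a minimal pure-Python sketch of Krippendorff's alpha with nominal distance. It is an illustration under stated assumptions, not Evalica's implementation: the function name nominal_alpha_sketch is hypothetical, and it takes the data as a list of units (the rows of the example above before the transposition), with None marking a missing rating.

```python
from collections import Counter
from itertools import permutations


def nominal_alpha_sketch(units):
    """Return (alpha, observed, expected) for nominal ratings grouped by unit."""
    observed = 0.0
    totals = Counter()
    for ratings in units:
        values = [v for v in ratings if v is not None]
        m = len(values)
        if m < 2:
            continue  # a unit with fewer than two ratings has no pairs
        totals.update(values)
        # Every ordered pair of ratings in the unit is weighted by 1 / (m - 1);
        # with nominal distance, only mismatching pairs contribute.
        mismatches = sum(1 for a, b in permutations(values, 2) if a != b)
        observed += mismatches / (m - 1)
    n = sum(totals.values())
    # Mismatching pairs expected by chance, given the overall value counts.
    # This sketch does not guard against n < 2 or a single repeated value.
    expected = (n * n - sum(c * c for c in totals.values())) / (n - 1)
    return 1.0 - observed / expected, observed, expected


units = [
    [1, 1, None, 1],
    [2, 2, 3, 2],
    [3, 3, 3, 3],
    [3, 3, 3, 3],
    [2, 2, 2, 2],
    [1, 2, 3, 4],
    [4, 4, 4, 4],
    [1, 1, 2, 1],
    [2, 2, 2, 2],
    [None, 5, 5, 5],
    [None, None, 1, 1],
]
alpha_value, observed, expected = nominal_alpha_sketch(units)
print(f"alpha={alpha_value:.4f}")
```

On this dataset only three units contain disagreements, so the observed term is 8 and the expected term is 1216/39, which gives alpha = 1 - 8 / (1216/39) ≈ 0.7434.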

Command-Line Interface

Evalica also provides a simple command-line interface, allowing the use of these methods in shell scripts and for prototyping.

Pairwise Ranking

$ evalica -i food.csv pairwise bradley-terry
item,score,rank
Tacos,2.509025136024378,1
Sushi,1.1011561298265815,2
Burger,0.8549063627182466,3
Pasta,0.7403814336665869,4
Pizza,0.5718366915548537,5

Refer to the food.csv file as an input example.

Krippendorff's Alpha

For Krippendorff's alpha, use a CSV file with ratings in a matrix format (no header):

$ evalica -i codings.csv alpha --distance=nominal
metric,value
alpha,0.743421052631579
observed,7.999999999999999
expected,31.179487179487182
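
A header-less matrix file like the one used above could be produced with the standard csv module, as in the sketch below. Two things here are assumptions rather than documented behaviour: that the command-line interface expects the same orientation as the Python function (rows are raters, columns are units), and that a missing rating is written as an empty cell. The helper name write_codings is hypothetical.

```python
import csv
from pathlib import Path


def write_codings(path, units):
    """Write ratings as a header-less CSV with one row per rater."""
    # Transpose from one-list-per-unit to one-row-per-rater.
    raters = zip(*units)
    with Path(path).open("w", newline="") as f:
        writer = csv.writer(f)
        for row in raters:
            # Assumption: an empty cell stands for a missing rating.
            writer.writerow(["" if v is None else v for v in row])


units = [
    [1, 1, None, 1],
    [2, 2, 3, 2],
    [3, 3, 3, 3],
    [3, 3, 3, 3],
    [2, 2, 2, 2],
    [1, 2, 3, 4],
    [4, 4, 4, 4],
    [1, 1, 2, 1],
    [2, 2, 2, 2],
    [None, 5, 5, 5],
    [None, None, 1, 1],
]
write_codings("codings.csv", units)
```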

Web Application

Evalica has a built-in Gradio application that can be launched with python3 -m evalica.gradio. Please ensure that the library was installed with the gradio extra: pip install evalica[gradio].

Implemented Methods

Method In Python In Rust
Counting
Average Win Rate
Bradley–Terry
Elo
Eigenvalue
PageRank
Newman
Krippendorff's Alpha

Contributing

Evalica is a mixed Rust/Python project that uses PyO3, so it requires setting up the Maturin build system.

To set up the environment, we recommend using the uv package manager, as demonstrated in our test suite:

$ uv venv
$ uv pip install maturin
$ source .venv/bin/activate
$ maturin develop --uv --extras dev,docs,gradio

In case uv is not available, you can use the following workaround:

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install maturin
$ maturin develop --extras dev,docs,gradio

We welcome pull requests on GitHub: https://github.com/dustalov/evalica. To contribute, fork the repository, create a separate branch for your changes, and submit a pull request.

Citation

@inproceedings{Ustalov:25,
  author    = {Ustalov, Dmitry},
  title     = {{Reliable, Reproducible, and Really Fast Leaderboards with Evalica}},
  year      = {2025},
  booktitle = {Proceedings of the 31st International Conference on Computational Linguistics: System Demonstrations},
  pages     = {46--53},
  address   = {Abu Dhabi, UAE},
  publisher = {Association for Computational Linguistics},
  eprint    = {2412.11314},
  eprinttype = {arxiv},
  eprintclass = {cs.CL},
  url       = {https://aclanthology.org/2025.coling-demos.6},
  language  = {english},
}

The code for replicating the experiments is available in the coling2025 directory.

Copyright

Copyright (c) 2024–2026 Dmitry Ustalov. See LICENSE for details.
