Evalica, your favourite evaluation toolkit.

Evalica

Evalica [ɛˈʋalit͡sa] (eh-vah-lee-tsah) is a Python library that transforms pairwise comparisons into ranked lists of items. It offers convenient high-performance Rust implementations of the corresponding methods via PyO3, and additionally provides naïve Python code for most of them. Evalica is fully compatible with NumPy arrays and pandas data frames.

The logo was created using Recraft.

Installation

  • pip: pip install evalica
  • pip without acceleration: pip install --no-binary evalica evalica
  • Anaconda: conda install conda-forge::evalica

Pairwise Comparisons

Imagine that we would like to rank several meals and have the following dataset of three comparisons produced by food experts.

Item X   Item Y   Winner
pizza    burger   x
burger   sushi    y
pizza    sushi    tie

Given this hypothetical example, Evalica takes these three columns and computes a score for every item according to the chosen model. Note that the first argument is the column Item X, the second argument is the column Item Y, and the third argument corresponds to the column Winner.

>>> from evalica import elo, Winner
>>> result = elo(
...     ['pizza', 'burger', 'pizza'],
...     ['burger', 'sushi', 'sushi'],
...     [Winner.X, Winner.Y, Winner.Draw],
... )
>>> result.scores
pizza     1014.972058
burger     970.647200
sushi     1014.380742
Name: elo, dtype: float64

As a result, we obtain Elo scores of our items. In this example, pizza was the most favoured item, sushi was the runner-up, and burger was the least preferred item.

Item     Score
pizza    1014.97
burger    970.65
sushi    1014.38
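
To see where such numbers can come from, here is a minimal pure-Python sketch of a sequential Elo update. It is not Evalica's implementation: the function name elo_sketch and the parameter values (initial score 1000, base 10, scale 400, k = 30) are assumptions, chosen because they reproduce the scores shown above for this example.

```python
from collections import defaultdict


def elo_sketch(xs, ys, winners, initial=1000.0, base=10.0, scale=400.0, k=30.0):
    """Sequential Elo update; each winner is 'x', 'y', or 'tie' (hypothetical labels)."""
    scores = defaultdict(lambda: initial)
    outcome_for_x = {"x": 1.0, "y": 0.0, "tie": 0.5}

    for x, y, winner in zip(xs, ys, winners):
        rating_x, rating_y = scores[x], scores[y]
        # Expected result for x given the current rating gap.
        expected_x = 1.0 / (1.0 + base ** ((rating_y - rating_x) / scale))
        actual_x = outcome_for_x[winner]
        # Both items move by the same amount in opposite directions.
        delta = k * (actual_x - expected_x)
        scores[x] = rating_x + delta
        scores[y] = rating_y - delta

    return dict(scores)


scores = elo_sketch(
    ["pizza", "burger", "pizza"],
    ["burger", "sushi", "sushi"],
    ["x", "y", "tie"],
)

for item, score in sorted(scores.items(), key=lambda pair: -pair[1]):
    print(f"{item:<8}{score:.2f}")
```

Because the update is sequential, the result depends on the order of the comparisons; shuffling the three rows would generally give different scores.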

Inter-Rater Reliability

Evalica also supports computing Krippendorff's alpha, a statistical measure of inter-rater reliability. Unlike pairwise comparisons, alpha accepts a matrix where rows represent raters (observers) and columns represent units (items being rated).

>>> import pandas as pd
>>> from evalica import alpha
>>> data = pd.DataFrame([
...     [1, 1, None, 1],
...     [2, 2, 3, 2],
...     [3, 3, 3, 3],
...     [3, 3, 3, 3],
...     [2, 2, 2, 2],
...     [1, 2, 3, 4],
...     [4, 4, 4, 4],
...     [1, 1, 2, 1],
...     [2, 2, 2, 2],
...     [None, 5, 5, 5],
...     [None, None, 1, 1],
... ]).T
>>> result = alpha(data, distance='nominal')
>>> result.alpha
0.7434210526315788

This example demonstrates computing alpha with nominal distance for categorical ratings. The result indicates substantial agreement among raters (alpha ≈ 0.74). Evalica supports multiple distance metrics: nominal, ordinal, interval, ratio, or custom distance functions.
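
For readers who want to check the number by hand, the following is a minimal pure-Python sketch of Krippendorff's alpha with nominal distance, written from the textbook coincidence-count formulation. It is an illustration under stated assumptions and not Evalica's code: the function name nominal_alpha_sketch is hypothetical, and the input is given as one list of ratings per unit (the layout of the nested list before the transpose above).

```python
from collections import Counter
from itertools import permutations


def nominal_alpha_sketch(units):
    """Return (alpha, observed, expected) for per-unit rating lists; None means missing."""
    observed = 0.0      # observed disagreement
    totals = Counter()  # frequency of each value among pairable ratings

    for ratings in units:
        values = [value for value in ratings if value is not None]
        count = len(values)
        if count < 2:
            continue  # a unit with fewer than two ratings cannot be paired
        totals.update(values)
        # Ordered pairs of ratings within the unit that disagree.
        mismatches = sum(1 for a, b in permutations(values, 2) if a != b)
        observed += mismatches / (count - 1)

    n = sum(totals.values())
    # Expected disagreement if the values were paired at random.
    # This is zero when only one distinct value occurs; that case is not handled here.
    expected = (n * n - sum(c * c for c in totals.values())) / (n - 1)
    return 1.0 - observed / expected, observed, expected


units = [
    [1, 1, None, 1],
    [2, 2, 3, 2],
    [3, 3, 3, 3],
    [3, 3, 3, 3],
    [2, 2, 2, 2],
    [1, 2, 3, 4],
    [4, 4, 4, 4],
    [1, 1, 2, 1],
    [2, 2, 2, 2],
    [None, 5, 5, 5],
    [None, None, 1, 1],
]

alpha_value, observed, expected = nominal_alpha_sketch(units)
print(round(alpha_value, 4))
```

Only three units contain any disagreement, which is why the observed disagreement is small relative to the expected one.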

Command-Line Interface

Evalica also provides a simple command-line interface, allowing the use of these methods in shell scripts and for prototyping.

Pairwise Ranking

$ evalica -i food.csv pairwise bradley-terry
item,score,rank
Tacos,2.509025136024378,1
Sushi,1.1011561298265815,2
Burger,0.8549063627182466,3
Pasta,0.7403814336665869,4
Pizza,0.5718366915548537,5

Refer to the food.csv file as an input example.
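
Since the command prints plain CSV, its output is easy to consume from a script. The sketch below parses the output shown above with the standard library only; the variable names are illustrative, and the text is pasted in rather than captured from a live run.

```python
import csv
import io

# Output copied verbatim from the command above.
output = """item,score,rank
Tacos,2.509025136024378,1
Sushi,1.1011561298265815,2
Burger,0.8549063627182466,3
Pasta,0.7403814336665869,4
Pizza,0.5718366915548537,5
"""

rows = csv.DictReader(io.StringIO(output))
ranking = [(row["item"], float(row["score"]), int(row["rank"])) for row in rows]

# Pick the item with the best (lowest) rank.
best_item, best_score, best_rank = min(ranking, key=lambda entry: entry[2])
print(best_item)
```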

Krippendorff's Alpha

For Krippendorff's alpha, use a CSV file with ratings in a matrix format (no header):

$ evalica -i codings.csv alpha --distance=nominal
metric,value
alpha,0.743421052631579
observed,7.999999999999999
expected,31.179487179487182
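
The three reported values are consistent with the usual definition alpha = 1 - observed / expected, assuming that observed and expected denote the observed and expected disagreement. A quick check using the numbers printed above:

```python
# Values copied from the command output above.
observed = 7.999999999999999
expected = 31.179487179487182

# Assumed relation: alpha = 1 - observed disagreement / expected disagreement.
alpha_value = 1.0 - observed / expected
print(round(alpha_value, 6))
```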

Web Application

Evalica has a built-in Gradio application that can be launched with python3 -m evalica.gradio. This requires installing the library with the gradio extra: pip install evalica[gradio].

Implemented Methods

Method In Python In Rust
Counting
Average Win Rate
Bradley–Terry
Elo
Eigenvalue
PageRank
Newman
Krippendorff's Alpha

Contributing

Evalica is a mixed Rust/Python project that uses PyO3, so it requires setting up the Maturin build system.

To set up the environment, we recommend using the uv package manager, as demonstrated in our test suite:

$ uv venv
$ uv pip install maturin
$ source .venv/bin/activate
$ maturin develop --uv --extras dev,docs,gradio

In case uv is not available, you can use the following workaround:

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install maturin
$ maturin develop --extras dev,docs,gradio

We welcome pull requests on GitHub: https://github.com/dustalov/evalica. To contribute, fork the repository, create a separate branch for your changes, and submit a pull request.

Citation

@inproceedings{Ustalov:25,
  author    = {Ustalov, Dmitry},
  title     = {{Reliable, Reproducible, and Really Fast Leaderboards with Evalica}},
  year      = {2025},
  booktitle = {Proceedings of the 31st International Conference on Computational Linguistics: System Demonstrations},
  pages     = {46--53},
  address   = {Abu Dhabi, UAE},
  publisher = {Association for Computational Linguistics},
  eprint    = {2412.11314},
  eprinttype = {arxiv},
  eprintclass = {cs.CL},
  url       = {https://aclanthology.org/2025.coling-demos.6},
  language  = {english},
}

The code for replicating the experiments is available in the coling2025 directory.

Copyright

Copyright (c) 2024–2026 Dmitry Ustalov. See LICENSE for details.
