Evalica, your favourite evaluation toolkit.

Evalica

Evalica [ɛˈʋalit͡sa] (eh-vah-lee-tsah) is an evaluation toolkit for statistical analysis, combining fast Rust implementations with Python APIs for ranking, reliability, and uncertainty estimation. Evalica is fully compatible with NumPy arrays and pandas data frames.

The logo was created using Recraft.

Installation

  • pip: pip install evalica
  • Anaconda: conda install conda-forge::evalica
  • Cargo: cargo add evalica

Pairwise Comparisons

Imagine that we would like to rank different meals and have the following dataset of three comparisons produced by food experts.

Item X Item Y Winner
pizza burger x
burger sushi y
pizza sushi tie

Given this hypothetical example, Evalica takes these three columns and computes a score for every item according to the chosen model. Note that the first argument is the column Item X, the second argument is the column Item Y, and the third argument corresponds to the column Winner.

>>> from evalica import elo, Winner
>>> result = elo(
...     ['pizza', 'burger', 'pizza'],
...     ['burger', 'sushi', 'sushi'],
...     [Winner.X, Winner.Y, Winner.Draw],
... )
>>> result.scores
pizza     1014.972058
burger     970.647200
sushi     1014.380742
Name: elo, dtype: float64

As a result, we obtain Elo scores of our items. In this example, pizza was the most favoured item, sushi was the runner-up, and burger was the least preferred item.

Item Score
pizza 1014.97
burger 970.65
sushi 1014.38

Inter-Rater Reliability

Evalica also supports computing Krippendorff's alpha, a statistical measure of inter-rater reliability. Unlike pairwise comparisons, alpha accepts a matrix where rows represent raters (observers) and columns represent units (items being rated).

>>> import pandas as pd
>>> from evalica import alpha
>>> data = pd.DataFrame([
...     [1, 1, None, 1],
...     [2, 2, 3, 2],
...     [3, 3, 3, 3],
...     [3, 3, 3, 3],
...     [2, 2, 2, 2],
...     [1, 2, 3, 4],
...     [4, 4, 4, 4],
...     [1, 1, 2, 1],
...     [2, 2, 2, 2],
...     [None, 5, 5, 5],
...     [None, None, 1, 1],
... ]).T
>>> result = alpha(data, distance='nominal')
>>> result.alpha
0.7434210526315788
>>> from evalica import alpha_bootstrap
>>> bootstrap_result = alpha_bootstrap(data, distance='nominal', n_resamples=1000, random_state=42)
>>> (bootstrap_result.low, bootstrap_result.high)
(0.4431818181818182, 0.9411764705882353)

This example demonstrates computing alpha and its bootstrap confidence intervals with nominal distance for categorical ratings. Evalica supports multiple distance metrics: nominal, ordinal, interval, ratio, or custom distance functions.
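
For readers who want to see where the number comes from, the following is a minimal pure-Python sketch of the textbook coincidence-matrix computation of alpha with nominal distance. It is an illustration under stated assumptions, not Evalica's code: the function name nominal_alpha is hypothetical, and the data are written as one list per unit (the same lists as above, before the transpose).

```python
from collections import Counter
from itertools import permutations


def nominal_alpha(units):
    """Return (alpha, observed, expected) for nominal ratings.

    Each element of units lists the ratings one unit received;
    None marks a missing rating.
    """
    coincidences = Counter()
    for unit in units:
        values = [value for value in unit if value is not None]
        m = len(values)
        if m < 2:
            continue  # a unit with a single rating cannot be paired
        # Every ordered pair of ratings within a unit contributes 1 / (m - 1).
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)
    # Marginal totals per category and the number of pairable values.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    # Nominal distance is 1 for different categories and 0 otherwise.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    return 1.0 - observed / expected, observed, expected


units = [
    [1, 1, None, 1], [2, 2, 3, 2], [3, 3, 3, 3], [3, 3, 3, 3],
    [2, 2, 2, 2], [1, 2, 3, 4], [4, 4, 4, 4], [1, 1, 2, 1],
    [2, 2, 2, 2], [None, 5, 5, 5], [None, None, 1, 1],
]
value, observed, expected = nominal_alpha(units)
print(round(value, 6), round(observed, 6), round(expected, 6))
```

Alpha is one minus the ratio of observed to expected disagreement, so values near 1 indicate strong agreement and values near 0 indicate agreement no better than chance.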

Command-Line Interface

Evalica also provides a simple command-line interface, allowing the use of these methods in shell scripts and for prototyping.

Pairwise Ranking

$ evalica -i food.csv pairwise bradley-terry
item,score,rank
Tacos,2.509025136024378,1
Sushi,1.1011561298265815,2
Burger,0.8549063627182466,3
Pasta,0.7403814336665869,4
Pizza,0.5718366915548537,5

Refer to the food.csv file as an input example.
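
As a rough idea of what a Bradley–Terry score represents, here is a minimal pure-Python sketch of the textbook minorization-maximization iteration for the model. It is not Evalica's implementation and does not read food.csv: the function name bradley_terry, the win counts, and the item names A, B, and C are all hypothetical, and the scores are normalized to sum to one, which is an arbitrary choice that need not match the scale printed by the CLI.

```python
def bradley_terry(wins, iterations=1000, tolerance=1e-10):
    """Estimate Bradley-Terry scores; wins[i][j] counts how often i beat j."""
    items = sorted(wins)
    scores = {item: 1.0 / len(items) for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            others = [j for j in items if j != i]
            total_wins = sum(wins[i].get(j, 0) for j in others)
            # Number of comparisons with each opponent, weighted by the
            # current combined strength of the pair.
            denominator = sum(
                (wins[i].get(j, 0) + wins[j].get(i, 0)) / (scores[i] + scores[j])
                for j in others
            )
            updated[i] = total_wins / denominator
        # Scores are only identified up to scale, so normalize them.
        norm = sum(updated.values())
        updated = {item: value / norm for item, value in updated.items()}
        converged = max(abs(updated[i] - scores[i]) for i in items) < tolerance
        scores = updated
        if converged:
            break
    return scores


# Hypothetical win counts for three items; every item wins at least once.
wins = {
    "A": {"B": 3, "C": 2},
    "B": {"A": 1, "C": 2},
    "C": {"A": 1, "B": 1},
}
bt_scores = bradley_terry(wins)
ranking = sorted(bt_scores, key=bt_scores.get, reverse=True)
print(ranking)
```

Under this model, the probability that item i beats item j is its score divided by the sum of both scores, so a higher score means a higher chance of winning any given comparison.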

Krippendorff's Alpha

For Krippendorff's alpha, use a CSV file with ratings in a matrix format (no header):

$ evalica -i codings.csv alpha --distance=nominal
metric,value
alpha,0.743421052631579
observed,7.999999999999999
expected,31.179487179487182

Web Application

Evalica has a built-in Gradio application that can be launched with python3 -m evalica.gradio. Make sure the library was installed with the gradio extra, as in pip install evalica[gradio].

Implemented Methods

Method In Python In Rust
Counting
Average Win Rate
Bradley–Terry
Elo
Eigenvalue
PageRank
Newman
Krippendorff's Alpha

Contributing

Evalica is a mixed Rust/Python project that uses PyO3, so it requires setting up the Maturin build system.

To set up the environment, we recommend using the uv package manager, as demonstrated in our test suite:

$ uv venv
$ uv pip install maturin
$ source .venv/bin/activate
$ maturin develop --uv --extras dev,docs,gradio

In case uv is not available, you can use the following workaround:

$ python3 -m venv venv
$ source venv/bin/activate
$ pip install maturin
$ maturin develop --extras dev,docs,gradio

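It is also possible to omit the Rust-accelerated routines via pip install --no-binary evalica evalica. The package name appears twice because the --no-binary option takes the package name as its value.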

We welcome pull requests on GitHub: https://github.com/dustalov/evalica. To contribute, fork the repository, create a separate branch for your changes, and submit a pull request.

Citation

@inproceedings{Ustalov:25,
  author    = {Ustalov, Dmitry},
  title     = {{Reliable, Reproducible, and Really Fast Leaderboards with Evalica}},
  year      = {2025},
  booktitle = {Proceedings of the 31st International Conference on Computational Linguistics: System Demonstrations},
  pages     = {46--53},
  address   = {Abu Dhabi, UAE},
  publisher = {Association for Computational Linguistics},
  eprint    = {2412.11314},
  eprinttype = {arxiv},
  eprintclass = {cs.CL},
  url       = {https://aclanthology.org/2025.coling-demos.6},
  language  = {english},
}

The code for replicating the experiments is available in the coling2025 directory.

Copyright

Copyright (c) 2024–2026 Dmitry Ustalov. See LICENSE for details.
