Bayesian Bradley-Terry model implementation in PyMC

These details have not been verified by PyPI

Project description

bbt-test

BBT-Test is a Python package for Bayesian Bradley-Terry model along with utilities for multi-algorithm multi-dataset statistical evaluation.

Installation
Quickstart
License

Installation

You can install bbt-test via pip:

pip install bbt-test

If needed, you can also install the latest development version directly from GitHub:

pip install git+https://github.com/scikit-fingerprints/bbt-test

Quickstart

To generate results from BBT model you need to first fit posterior MCMC samples. BBT-Test supports unpaired (1 metric readout per algorithm per dataset) and paired (multiple metric readouts per algorithm per dataset) data.

For hands-on example of using the package, check out our example notebook: 01_simple_bbt_comparison.ipynb.

Unpaired posterior fitting

Start with single dataframe in shape (n_datasets, n_algorithms), optionally this dataframe can contain a dataset column:

import pandas as pd

df = pd.DataFrame({
    "dataset": ["ds1", "ds2", "ds3"],
    "alg1": [0.8, 0.75, 0.9],
    "alg2": [0.7, 0.8, 0.85],
    "alg3": [0.9, 0.95, 0.88],
})

To generate data for BBT model, fit the BBTTest model with the dataframe

from bbttest import BBTTest

model = BBTTest(
    local_rope_value=0.01, # Here you can define what is a tie in case of unpaired data, default is None
    # In this case the model will assume that if difference is below 0.01 there's a tie.
).fit(
    df,
    dataset_col="dataset", # If dataset column is present, specify it here
)

Evaluating BBT when reporting errors

By default BBT assumes that the goal of the evaluation is to maximize the metric (e.g. when reporting F1 score or AUROC). In cases, when metrics reported in the dataframe should be minimized (e.g. RMSE), you can set the parameter maximize in BBTTest to False:

model = BBTTest(
    local_rope_value=0.01,
    maximize=False, # Set to False if the metric should be minimized
).fit(
    df,
    dataset_col="dataset",
)

Paired posterior fitting

BBTTest model supports two variants of input data for paired case, either a single dataframe with multiple rows per algorithm per dataset, or a pair of dataframes, one defining mean performance per algorithm, and the second with standard deviations.

import pandas as pd
from bbttest import BBTTest

df = pd.DataFrame({
    "dataset": ["ds1", "ds1", "ds1", "ds2", "ds2", "ds2", "ds3", "ds3", "ds3"],
    "alg1": [0.8, 0.82, 0.79, 0.75, 0.77, 0.74, 0.9, 0.91, 0.89],
    "alg2": [0.7, 0.72, 0.69, 0.8, 0.78, 0.81, 0.85, 0.86, 0.84],
    "alg3": [0.9, 0.92, 0.91, 0.95, 0.94, 0.96, 0.88, 0.87, 0.89],
})

model = BBTTest(
    local_rope_value=0.1, # In this case ties will be counted if the difference is below square root mean of
    # standard deviations multiplied by local_rope_value
).fit(
    df,
    dataset_col="dataset",
)

Generating BBT posterior statistics and interpretations

Once you obtained a fitted BBTTest model, you can generate statistic dataframe containing information about every hypothesis (i.e. every pair of algorithms). The table includes general statistics in form of mean and delta values, as well as probabilities of one algorithm being better than the other, or being tied. Additionally, by default the table contains weak and strong interpretations of the results based on ROPE values.

stats_df = model.posterior_table(
    rope_value=(0.45, 0.55), # Defines ROPE of hypothesis for interpretations
    control_model="alg1", # If provided, only hypotheses comparing to control_model will be included
    selected_models=["alg2"], # If provided, only hypotheses comparing selected_models will be included
)

print(stats_df)

          pair  mean  delta  above_50  in_rope weak_interpretation
0  alg1 > alg2  0.63   0.53      0.75     0.19             Unknown

Additionally, you can generate multiple hypothesis interpretations regarding control model for different ROPE values:

stats_df = model.rope_comparison_control_table(
    rope_values=[(0.4, 0.6), (0.45, 0.55), (0.48, 0.52)],
    control_model="alg1",
    interpretation="weak",
)

print(stats_df)

rope_value better_models equivalent_models worse_models unknown_models
0    (0.4, 0.6)                                                  alg3, alg1
1  (0.45, 0.55)                                                  alg3, alg1
2  (0.48, 0.52)                                                  alg3, alg1

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.0.4

May 8, 2026

0.0.3

Apr 23, 2026

0.0.2

Dec 5, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bbt_test-0.0.4.tar.gz (13.9 kB view details)

Uploaded May 8, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

bbt_test-0.0.4-py3-none-any.whl (17.0 kB view details)

Uploaded May 8, 2026 Python 3

File details

Details for the file bbt_test-0.0.4.tar.gz.

File metadata

Download URL: bbt_test-0.0.4.tar.gz
Upload date: May 8, 2026
Size: 13.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.11 {"installer":{"name":"uv","version":"0.11.11","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for bbt_test-0.0.4.tar.gz
Algorithm	Hash digest
SHA256	`ae46ac6a95414ed1935777aa04e49731f5fdb47fdb66317e546e060a739ec80a`
MD5	`5f393e8ff131a7c9564c41894bc4790d`
BLAKE2b-256	`143e4a48c82679f6115745a21ca21ab840d46956589ba418f9e29a75f711f2ff`

See more details on using hashes here.

File details

Details for the file bbt_test-0.0.4-py3-none-any.whl.

File metadata

Download URL: bbt_test-0.0.4-py3-none-any.whl
Upload date: May 8, 2026
Size: 17.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.11 {"installer":{"name":"uv","version":"0.11.11","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for bbt_test-0.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a0821da32ceda1c01672df24a501dfadb852024ffc92daf8e7f41a53a4d58af3`
MD5	`9661db2764c60c58a8b400eaa8f05fdc`
BLAKE2b-256	`702910208ef4226fcf33d2cdb3e65d494fcc67f6bd364b98c0b0de74f0a19033`

See more details on using hashes here.

bbt-test 0.0.4

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

bbt-test

Table of Contents

Installation

Quickstart

Unpaired posterior fitting

Evaluating BBT when reporting errors

Paired posterior fitting

Generating BBT posterior statistics and interpretations

License

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes