
A framework for evaluating counterfactual explanations.


CEval — Counterfactual Explanation Evaluator


CEval is a lightweight Python package for evaluating the quality of counterfactual explanations produced by post-hoc XAI (Explainable AI) methods. It computes 14 established metrics with a single call and works with models from many frameworks, including scikit-learn, XGBoost, PyTorch, and Keras.

Paper: Bayrak, B., & Bach, K. (2024). Evaluation of Instance-Based Explanations: An In-Depth Analysis of Counterfactual Evaluation Metrics, Challenges, and the CEval Toolkit. IEEE Access. doi:10.1109/ACCESS.2024.3410540


Why CEval?

When you build or compare counterfactual explainers, you need more than one number to judge quality. CEval lets you measure all key dimensions (validity, proximity, sparsity, diversity, feasibility, and more) in a single unified framework, across different explainers and datasets.

from ceval import CEval

evaluator = CEval(samples=test_df, label="income", data=train_df, model=clf)
evaluator.add_explainer("DiCE",  dice_cfs,  "generated-cf")
evaluator.add_explainer("DICE+", dicep_cfs, "generated-cf")

print(evaluator.comparison_table)

Installation

pip install CEval

Requirements: Python ≥ 3.9, pandas, numpy, scikit-learn, scipy, gower, category-encoders


Metrics

| Metric | Description |
|---|---|
| `validity` | Fraction of CFs that actually flip the classifier's prediction |
| `proximity` | Average feature-space distance between an instance and its CF |
| `proximity_gower` | Proximity using the Gower mixed-type distance |
| `sparsity` | Average fraction of features changed |
| `count` | Average number of CFs per instance |
| `diversity` | Determinant-based spread of the CF set |
| `diversity_lcc` | Diversity weighted by label-class coverage |
| `yNN` | Label consistency of the CF's k nearest neighbours |
| `feasibility` | Average kNN distance of CFs to the training set |
| `kNLN_dist` | Distance of a CF to its nearest same-class neighbour |
| `relative_dist` | dist(x, CF) / dist(x, NUN) |
| `redundancy` | Average number of unnecessary feature changes |
| `plausibility` | dist(CF, NLN) / dist(NLN, NUN(NLN)) |
| `constraint_violation` | Fraction of CFs that break user constraints |

Here NUN denotes the nearest unlike neighbour (the closest training instance with a different label) and NLN the nearest like neighbour. Metrics that depend on the classifier's predictions or on the training distribution require the `model` and `data` arguments, respectively (see the API reference below).

Not every metric applies to every explanation type; CEval handles this automatically and fills non-applicable cells with "-".
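To make the first two metrics concrete, here is a minimal sketch of how validity and sparsity are commonly defined. This is illustrative only, not CEval's internal implementation, and the function names are made up:

```python
def validity(cf_preds, target_labels):
    """Fraction of counterfactuals whose predicted class equals the target class."""
    return sum(p == t for p, t in zip(cf_preds, target_labels)) / len(cf_preds)

def sparsity_one(instance, cf):
    """Fraction of features changed between an instance and its counterfactual."""
    changed = sum(a != b for a, b in zip(instance, cf))
    return changed / len(instance)

validity([1, 1, 0], [1, 1, 1])              # → 2/3: one CF failed to flip the label
sparsity_one([30, "US", 5], [30, "UK", 7])  # → 2/3: two of three features changed
```

CEval averages such per-instance values across the whole explanation set.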


Quick Start

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from ceval import CEval

# 1. Prepare your data
train_df = ...   # pd.DataFrame with features + label column
test_df  = ...   # pd.DataFrame with features + label column
clf      = RandomForestClassifier().fit(train_df.drop("label", axis=1),
                                        train_df["label"])

# 2. Generate counterfactuals with your favourite explainer
#    (DiCE, PertCF, NICE, etc.)
counterfactuals = ...   # pd.DataFrame, same columns as test_df

# 3. Evaluate
evaluator = CEval(
    samples    = test_df,        # instances to explain
    label      = "label",        # target column name
    data       = train_df,       # background dataset (unlocks more metrics)
    model      = clf,            # fitted classifier (unlocks more metrics)
    k_nn       = 5,              # neighbours for kNN-based metrics
    constraints= ["age"],        # features that must not change (optional)
)

evaluator.add_explainer(
    name            = "MyExplainer",
    explanations    = counterfactuals,
    exp_type        = "generated-cf",   # "generated-cf" | "existed-cf" |
                                        # "generated-factual" | "existed-factual"
    mode            = "1to1",           # "1to1" | "1toN"
)

print(evaluator.comparison_table)

Explanation types

| `exp_type` | When to use |
|---|---|
| `"generated-cf"` | Counterfactuals synthesised by an algorithm (e.g. DiCE, PertCF) |
| `"existed-cf"` | Counterfactuals retrieved from the training set |
| `"generated-factual"` | Factual explanations generated by an algorithm |
| `"existed-factual"` | Factual explanations retrieved from the training set |

Explanation modes

| `mode` | DataFrame shape | When to use |
|---|---|---|
| `"1to1"` | Same number of rows as `samples` | One explanation per instance |
| `"1toN"` | Any number of rows plus an `"instance"` column | Multiple explanations per instance |

Model Compatibility

CEval works with any classifier, not just scikit-learn.
Use the built-in wrappers from ceval.wrappers to adapt your model:

| Framework | Wrapper class | Import |
|---|---|---|
| scikit-learn | (none needed) | pass the model directly |
| XGBoost | `XGBoostWrapper` | `from ceval.wrappers import XGBoostWrapper` |
| LightGBM | `LightGBMWrapper` | `from ceval.wrappers import LightGBMWrapper` |
| CatBoost | `CatBoostWrapper` | `from ceval.wrappers import CatBoostWrapper` |
| PyTorch | `TorchWrapper` | `from ceval.wrappers import TorchWrapper` |
| Keras / TensorFlow | `KerasWrapper` | `from ceval.wrappers import KerasWrapper` |
| Anything else | `GenericWrapper` | `from ceval.wrappers import GenericWrapper` |

# PyTorch
from ceval.wrappers import TorchWrapper
model = TorchWrapper(my_net, num_classes=2, device="cuda")

# XGBoost  (works with XGBClassifier and native Booster)
from ceval.wrappers import XGBoostWrapper
model = XGBoostWrapper(xgb_clf)

# Keras / TensorFlow
from ceval.wrappers import KerasWrapper
model = KerasWrapper(keras_model, num_classes=3)

# Anything else — supply two callables
from ceval.wrappers import GenericWrapper
model = GenericWrapper(
    predict_fn       = lambda X: my_model.infer(X).argmax(axis=1),
    predict_proba_fn = lambda X: my_model.infer(X),
)

# Then use as normal
evaluator = CEval(samples=test_df, label="income", data=train_df, model=model)

If you pass an incompatible model without a wrapper, CEval raises a clear TypeError that tells you exactly which wrapper to use.
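To make the wrapper contract concrete, the sketch below shows a toy model exposing the sklearn-style `predict` / `predict_proba` interface that the wrappers adapt other frameworks to. The class and its threshold rule are purely illustrative:

```python
import numpy as np

class ThresholdModel:
    """Toy binary classifier: predicts class 1 when the first feature is positive."""

    def predict_proba(self, X):
        X = np.asarray(X, dtype=float)
        p1 = (X[:, 0] > 0).astype(float)       # probability of class 1 (hard 0/1 here)
        return np.column_stack([1 - p1, p1])   # shape (n_samples, n_classes)

    def predict(self, X):
        return self.predict_proba(X).argmax(axis=1)

model = ThresholdModel()
preds = model.predict([[1.0, 2.0], [-3.0, 0.5]])   # → array([1, 0])
```

`GenericWrapper`'s `predict_fn` and `predict_proba_fn` should return arrays of the same shapes as these two methods.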


See examples/demo_adult_income.py for a complete working demo that:

  • Loads the Adult Income dataset
  • Trains a Random Forest classifier
  • Generates counterfactuals with DiCE
  • Evaluates them in both 1-to-1 and 1-to-N mode
  • Prints a full comparison table
python examples/demo_adult_income.py

Expected output:

                     DiCE (1-to-1)  DiCE (1-to-N)
validity                      0.90          0.867
proximity_gower               0.11          0.152
sparsity                      0.32          0.347
yNN                           0.68          0.713
feasibility                  48.21         183.44
redundancy                    0.80          0.733
constraint_violation          0.50          0.233
...

Comparing Multiple Explainers

evaluator = CEval(samples=test_df, label="label", data=train_df, model=clf)

evaluator.add_explainer("DiCE",   dice_cfs,   "generated-cf", mode="1toN")
evaluator.add_explainer("PertCF", pertcf_cfs, "generated-cf", mode="1toN")
evaluator.add_explainer("NICE",   nice_cfs,   "existed-cf",   mode="1to1")

# Side-by-side comparison
print(evaluator.comparison_table.T)
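Because `comparison_table` is a plain `pd.DataFrame`, ordinary pandas operations can rank the registered explainers. The sketch below uses invented values in the same one-row-per-explainer, one-column-per-metric shape; real numbers come from CEval:

```python
import pandas as pd

# Hypothetical comparison table (values are made up for illustration).
table = pd.DataFrame(
    {"validity": [0.90, 0.85, 0.95], "proximity_gower": [0.11, 0.09, 0.20]},
    index=["DiCE", "PertCF", "NICE"],
)

# Rank explainers by validity, where higher is better.
ranked = table.sort_values("validity", ascending=False)
best = ranked.index[0]   # → "NICE" in this toy table
```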

API Reference

CEval:

CEval(samples, label, ...)

| Parameter | Type | Default | Description |
|---|---|---|---|
| `samples` | `pd.DataFrame` | required | Instances to be explained (including the label column) |
| `label` | `str` | required | Name of the target column |
| `data` | `pd.DataFrame` | `None` | Full background dataset; unlocks distribution-based metrics |
| `model` | sklearn-compatible estimator | `None` | Fitted classifier; unlocks prediction-based metrics |
| `k_nn` | `int` | `5` | Number of neighbours for kNN-based metrics |
| `encoder` | `str` | `None` | category-encoders encoder name for categoricals; `None` falls back to `OrdinalEncoder` |
| `distance` | `str` | `None` | SciPy distance metric for proximity; `None` uses the built-in mixed-type metric |
| `constraints` | `list[str]` | `None` | Feature names that must not change in valid CFs |

evaluator.add_explainer:

evaluator.add_explainer(name, explanations, exp_type, mode="1to1")

Registers an explainer and computes all applicable metrics. Results are appended to evaluator.comparison_table.

evaluator.comparison_table:

A pd.DataFrame with one row per explainer and one column per metric. Non-applicable metrics show "-".


Citation

If you use CEval in your research, please cite:

@article{bayrak2024ceval,
  title   = {Evaluation of Instance-Based Explanations: An In-Depth Analysis of Counterfactual Evaluation Metrics, Challenges, and the CEval Toolkit},
  author  = {Bayrak, Bet{\"u}l and Bach, Kerstin},
  journal = {IEEE Access},
  year    = {2024},
  doi     = {10.1109/ACCESS.2024.3410540}
}

Related Work

This package is part of a broader research effort on counterfactual explanation methods:

  • PerCE — Personalised Counterfactual Explanations (IEEE)
  • PertCF — Perturbation-based Counterfactual Explainer (Paper | Code)

License

MIT © Betül Bayrak
