xai-eval: XAI Explanation Disagreement Auditing Library

A stress-testing framework for evaluating XAI explanation divergence.

A Python library for quantifying and stress-testing disagreement between Explainable AI (XAI) methods. Based on the disagreement framework proposed in Krishna et al. (2024), published in Transactions on Machine Learning Research.


The Problem

When you ask two explanation methods why your model made a decision:

  • LIME says → "Low income was the top reason"
  • SHAP says → "High debt was the top reason"

Same model. Same decision. Two different answers.

This library gives engineers a standardized way to measure, track, and stress-test this disagreement — and find the exact point where explanations become too unstable to trust.


Installation

git clone https://github.com/hemcharan710-afk/xai-eval-sandbox.git
cd xai-eval-sandbox
pip install -e ".[dev]"
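
The package is also published on PyPI, so the latest release can be installed without cloning:

pip install xai-eval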

Quick Start

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from xai_eval.pipeline.evaluator import evaluate

# Train a model
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Run a full XAI audit
results = evaluate(
    model=model,
    X=X,
    explainer_1="shap",
    explainer_2="lime",
    degradation_levels=[0.01, 0.05, 0.1, 0.3, 0.5]
)

Output:

==============================================================
           XAI AUDIT REPORT
==============================================================
  Explainer 1   : shap
  Explainer 2   : lime
  Baseline RBO  : 0.972
  Baseline Jacc : 1.000
--------------------------------------------------------------
  Degradation    RBO Score    Jaccard      Status
--------------------------------------------------------------
  0.01           0.965        1.000        ✅ Safe
  0.05           0.931        0.500        ✅ Safe
  0.10           1.000        1.000        ✅ Safe
  0.30           1.000        1.000        ✅ Safe
  0.50           0.988        1.000        ✅ Safe
--------------------------------------------------------------
  ✅ Explainers stable across all levels. Safe to trust.
==============================================================

Interpreting Results

  RBO Score   Meaning
  0.8 - 1.0   ✅ Safe: explainers strongly agree
  0.5 - 0.8   ⚠️ Warning: explainers drifting apart
  0.0 - 0.5   🚨 Danger: do not trust either explainer

Supported Explainers

  Explainer              String Name          Works With
  Linear SHAP            "shap"               Linear models
  Tree SHAP              "shap_tree"          Tree models
  Kernel SHAP            "kernel_shap"        Any model
  LIME                   "lime"               Any model
  LIME (tree)            "lime_tree"          Tree models
  Integrated Gradients   "integrated_grads"   Linear models
  Vanilla Gradient       "vanilla_gradient"   Linear models
  Gradient x Input       "gradient_x_input"   Linear models
  SmoothGrad             "smoothgrad"         Linear models
  Permutation            "permutation"        Any model

Compare Any Two Explainers

# SHAP vs LIME
evaluate(model, X, "shap", "lime")

# Vanilla Gradient vs SmoothGrad
evaluate(model, X, "vanilla_gradient", "smoothgrad")

# Gradient x Input vs Integrated Gradients
evaluate(model, X, "gradient_x_input", "integrated_grads")

# Tree models
evaluate(model, X, "shap_tree", "lime_tree")

Metrics

Rank-Biased Overlap (RBO)

Measures how similar two ranked feature lists are. Top features are weighted more heavily than bottom features.

from xai_eval.metrics.rbo import rbo

rbo(['age', 'income', 'debt'], ['age', 'income', 'debt'])  # 1.0
rbo(['age', 'income', 'debt'], ['debt', 'income', 'age'])  # ~0.46
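
For intuition: the documented values match a normalized, truncated variant of Rank-Biased Overlap (Webber et al., 2010) with persistence p = 0.9. A minimal sketch under that assumption (the library's exact implementation may differ):

def rbo_sketch(a, b, p=0.9):
    """Truncated RBO, normalized so identical lists score 1.0."""
    k = min(len(a), len(b))
    # agreement at depth d = fraction of shared items in the top-d prefixes
    raw = sum(p ** (d - 1) * len(set(a[:d]) & set(b[:d])) / d for d in range(1, k + 1))
    ideal = sum(p ** (d - 1) for d in range(1, k + 1))  # perfect agreement at every depth
    return raw / ideal

rbo_sketch(['age', 'income', 'debt'], ['debt', 'income', 'age'])  # ~0.46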

Jaccard@k

Measures what fraction of top-k features both explainers agree on.

from xai_eval.metrics.jaccard import jaccard_at_k

jaccard_at_k(['age', 'income', 'debt'], ['age', 'income', 'debt'], k=3)  # 1.0
jaccard_at_k(['age', 'income', 'debt'], ['age', 'tax',   'loan'],  k=3)  # 0.33
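
Note that the documented values correspond to intersection-over-k (the fraction of the top-k that both rankings share) rather than classical intersection-over-union Jaccard. A one-line sketch under that reading:

def jaccard_at_k_sketch(a, b, k):
    # fraction of the top-k features that appear in both rankings
    return len(set(a[:k]) & set(b[:k])) / k

jaccard_at_k_sketch(['age', 'income', 'debt'], ['age', 'tax', 'loan'], k=3)  # ~0.33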

Degradation Simulations

Weight Noise

Simulates model decay by adding Gaussian noise to the model weights.

from xai_eval.degradation.noise import add_weight_noise

add_weight_noise(model, noise_level=0.1)
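
To see the effect directly, you can degrade a copy of a linear model and compare feature rankings before and after. A sketch that assumes add_weight_noise perturbs the estimator's weights in place (as the call pattern above suggests) and uses coefficient magnitudes as a stand-in for an explainer's ranking:

import copy
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from xai_eval.degradation.noise import add_weight_noise
from xai_eval.metrics.rbo import rbo

X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

features = [f"f{i}" for i in range(X.shape[1])]

def rank(m):
    # order feature names by descending absolute coefficient
    return [features[i] for i in np.argsort(-np.abs(m.coef_[0]))]

baseline = rank(model)

for level in [0.01, 0.1, 0.5]:
    noisy = copy.deepcopy(model)                 # keep the original model intact
    add_weight_noise(noisy, noise_level=level)   # assumed to mutate weights in place
    print(level, rbo(baseline, rank(noisy)))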

Data Drift

Simulates real-world data distribution shift.

from xai_eval.degradation.drift import apply_data_drift

X_drifted = apply_data_drift(X, drift_level=0.1)
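
Drifted data can then be fed back through the audit pipeline, for example to check whether explainer agreement survives the shift:

from xai_eval.pipeline.evaluator import evaluate

# reusing the model and X from the Quick Start example
X_drifted = apply_data_drift(X, drift_level=0.1)
results = evaluate(model, X_drifted, "shap", "lime")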

Quantization

Simulates model compression by reducing weight precision.

from xai_eval.degradation.quantize import quantize_model

quantize_model(model, bits=8)
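
For intuition, uniform quantization snaps each weight onto a grid of 2**bits evenly spaced levels. A small NumPy illustration of the general technique (not necessarily what quantize_model does internally):

import numpy as np

def uniform_quantize(w, bits=8):
    # map weights onto 2**bits levels spanning [w.min(), w.max()]
    lo, hi = float(w.min()), float(w.max())
    step = (hi - lo) / (2 ** bits - 1)
    return lo + np.round((w - lo) / step) * step

w = np.array([-0.731, 0.002, 0.415, 1.268])
print(uniform_quantize(w, bits=8))  # each value snapped to the nearest grid point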

Real-World Use Case

# A bank wants to audit its loan approval model
from sklearn.ensemble import RandomForestClassifier
from xai_eval.pipeline.evaluator import evaluate

# The bank's existing model (X_train and y_train assumed already prepared)
model = RandomForestClassifier().fit(X_train, y_train)

# Run weekly automated audit
results = evaluate(
    model=model,
    X=X_test,
    explainer_1="shap_tree",
    explainer_2="lime_tree",
    degradation_levels=[0.01, 0.05, 0.1, 0.3, 0.5]
)

# Set automated alert
if results["baseline_rbo"] < 0.5:
    print("WARNING: Explainers are unreliable. Do not use for auditing.")

if results["breaking_point"] is not None:
    print(f"WARNING: Model breaks at degradation level {results['breaking_point']}")

Supported Datasets

  Type               Examples
  Tabular            COMPAS, German Credit, loan approval, medical diagnosis
  Classification     Binary, multiclass
  Regression         With minor modification
  sklearn datasets   load_breast_cancer(), load_iris(), load_wine()
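
For example, a quick audit on a built-in sklearn dataset, reusing the evaluate call documented above:

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from xai_eval.pipeline.evaluator import evaluate

data = load_breast_cancer()
model = LogisticRegression(max_iter=10000).fit(data.data, data.target)
results = evaluate(model, data.data, "shap", "lime")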

Running Tests

pytest tests/ -v
# 18 passed in 5.43s

Project Structure

xai-eval-sandbox/
├── src/xai_eval/
│   ├── explainers.py       # 10 XAI explainer wrappers
│   ├── metrics/
│   │   ├── rbo.py          # Rank-Biased Overlap
│   │   └── jaccard.py      # Jaccard@k similarity
│   ├── degradation/
│   │   ├── noise.py        # Weight noise injection
│   │   ├── drift.py        # Data drift simulation
│   │   └── quantize.py     # Model quantization
│   └── pipeline/
│       └── evaluator.py    # End-to-end audit pipeline
└── tests/
    ├── test_metrics.py
    └── test_degradation.py

Checklist

  • RBO metric
  • Jaccard@k metric
  • Weight noise degradation
  • Data drift degradation
  • Model quantization degradation
  • 10 explainer wrappers
  • End-to-end audit pipeline
  • 18 passing tests
  • PyPI publish
  • GitHub Actions CI

Contributing

Contributions welcome! Areas that could use work:

  • New explainer wrappers
  • New degradation simulations
  • New disagreement metrics
  • Support for neural networks
  • Visualization tools

To set up a development environment:

git clone https://github.com/hemcharan710-afk/xai-eval-sandbox.git
cd xai-eval-sandbox
pip install -e ".[dev]"
pytest tests/ -v

Reference

Krishna, S., Han, T., Gu, A., Wu, S., Jabbari, S., & Lakkaraju, H. (2024). The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective. Transactions on Machine Learning Research. https://openreview.net/forum?id=jESY2WTZCe


License

MIT License — free to use, modify, and distribute.


Author

Hemcharan (@hemcharan710-afk)
