xai-eval: XAI Explanation Disagreement Auditing Library
A stress-testing framework for evaluating XAI explanation divergence.
A Python library for quantifying and stress-testing disagreement between Explainable AI (XAI) methods. Based on the disagreement framework proposed in Krishna et al. (2024), published in Transactions on Machine Learning Research.
The Problem
When you ask two explanation methods why your model made a decision:
- LIME says: "Low income was the top reason"
- SHAP says: "High debt was the top reason"
Same model. Same decision. Two different answers.
This library gives engineers a standardized way to measure, track, and stress-test this disagreement — and find the exact point where explanations become too unstable to trust.
Installation
git clone https://github.com/hemcharan710-afk/xai-eval-sandbox.git
cd xai-eval-sandbox
pip install -e ".[dev]"
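The project is also published to PyPI as xai_eval 0.1.0, so installing the released package with plain pip should work as well:
pip install xai-eval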
Quick Start
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from xai_eval.pipeline.evaluator import evaluate
# Train a model
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
model = LogisticRegression(max_iter=1000).fit(X, y)
# Run a full XAI audit
results = evaluate(
    model=model,
    X=X,
    explainer_1="shap",
    explainer_2="lime",
    degradation_levels=[0.01, 0.05, 0.1, 0.3, 0.5]
)
Output:
==============================================================
XAI AUDIT REPORT
==============================================================
Explainer 1 : shap
Explainer 2 : lime
Baseline RBO : 0.972
Baseline Jacc : 1.000
--------------------------------------------------------------
Degradation    RBO Score    Jaccard    Status
--------------------------------------------------------------
0.01           0.965        1.000      ✅ Safe
0.05           0.931        0.500      ✅ Safe
0.10           1.000        1.000      ✅ Safe
0.30           1.000        1.000      ✅ Safe
0.50           0.988        1.000      ✅ Safe
--------------------------------------------------------------
✅ Explainers stable across all levels. Safe to trust.
==============================================================
Interpreting Results
| RBO Score | Meaning |
|---|---|
| 0.8 - 1.0 | ✅ Safe — explainers strongly agree |
| 0.5 - 0.8 | ⚠️ Warning — explainers drifting apart |
| 0.0 - 0.5 | 🚨 Danger — do not trust either explainer |
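If you want to act on these bands programmatically, they are easy to encode. The helper below is a convenience sketch of the thresholds in the table above; classify_rbo is not part of the library API:
def classify_rbo(score: float) -> str:
    """Map an RBO score onto the bands in the table above."""
    if score >= 0.8:
        return "Safe"     # explainers strongly agree
    if score >= 0.5:
        return "Warning"  # explainers drifting apart
    return "Danger"       # do not trust either explainer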
Supported Explainers
| Explainer | String Name | Works With |
|---|---|---|
| Linear SHAP | "shap" | Linear models |
| Tree SHAP | "shap_tree" | Tree models |
| Kernel SHAP | "kernel_shap" | Any model |
| LIME | "lime" | Any model |
| LIME (tree) | "lime_tree" | Tree models |
| Integrated Gradients | "integrated_grads" | Linear models |
| Vanilla Gradient | "vanilla_gradient" | Linear models |
| Gradient x Input | "gradient_x_input" | Linear models |
| SmoothGrad | "smoothgrad" | Linear models |
| Permutation | "permutation" | Any model |
Compare Any Two Explainers
# SHAP vs LIME
evaluate(model, X, "shap", "lime")
# Vanilla Gradient vs SmoothGrad
evaluate(model, X, "vanilla_gradient", "smoothgrad")
# Gradient x Input vs Integrated Gradients
evaluate(model, X, "gradient_x_input", "integrated_grads")
# Tree models
evaluate(model, X, "shap_tree", "lime_tree")
Metrics
Rank-Biased Overlap (RBO)
Measures how similar two ranked feature lists are. Top features are weighted more heavily than bottom features.
from xai_eval.metrics.rbo import rbo
rbo(['age', 'income', 'debt'], ['age', 'income', 'debt']) # 1.0
rbo(['age', 'income', 'debt'], ['debt', 'income', 'age']) # ~0.46
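For background, RBO is the measure of Webber, Moffat & Zobel (2010): a weighted average of the overlap between the two rankings at increasing depths, with a persistence parameter p controlling how top-heavy the weighting is. Below is a minimal truncated sketch, not the library's implementation; the library's scores likely extrapolate over the unseen tail, so this sketch will not reproduce them exactly:
def rbo_sketch(s, t, p=0.9):
    """Truncated Rank-Biased Overlap: a weighted average of the
    overlap between the two rankings at each depth d."""
    depth = min(len(s), len(t))
    score = 0.0
    for d in range(1, depth + 1):
        overlap = len(set(s[:d]) & set(t[:d]))  # agreement at depth d
        score += (p ** (d - 1)) * (overlap / d)
    return (1 - p) * score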
Jaccard@k
Measures what fraction of top-k features both explainers agree on.
from xai_eval.metrics.jaccard import jaccard_at_k
jaccard_at_k(['age', 'income', 'debt'], ['age', 'income', 'debt'], k=3) # 1.0
jaccard_at_k(['age', 'income', 'debt'], ['age', 'tax', 'loan'], k=3) # 0.33
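Note that, judging from the second example (one shared feature out of k=3 yields 0.33, not the union-based 0.2), the score appears to divide the top-k intersection by k rather than by the size of the union. A minimal sketch under that reading:
def jaccard_at_k_sketch(a, b, k):
    """Fraction of the top-k features shared by both rankings."""
    return len(set(a[:k]) & set(b[:k])) / k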
Degradation Simulations
Weight Noise
Simulates model decay by adding Gaussian noise to the model's weights.
from xai_eval.degradation.noise import add_weight_noise
add_weight_noise(model, noise_level=0.1)
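The library's exact noise model isn't documented here, but the idea can be sketched for a fitted scikit-learn linear model. add_weight_noise_sketch below is an illustrative stand-in, not the library function:
import numpy as np

def add_weight_noise_sketch(model, noise_level, seed=0):
    """Add zero-mean Gaussian noise with std `noise_level` to the
    fitted model's weight vector."""
    rng = np.random.default_rng(seed)
    model.coef_ = model.coef_ + rng.normal(0.0, noise_level, size=model.coef_.shape)
    return model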
Data Drift
Simulates real-world data distribution shift.
from xai_eval.degradation.drift import apply_data_drift
X_drifted = apply_data_drift(X, drift_level=0.1)
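Again, the exact transformation is internal to the library. One common way to simulate covariate shift is to offset each feature by a random amount proportional to its standard deviation; apply_data_drift_sketch is an illustrative assumption:
import numpy as np

def apply_data_drift_sketch(X, drift_level, seed=0):
    """Shift every feature by a random offset scaled by the feature's
    standard deviation and the requested drift level."""
    rng = np.random.default_rng(seed)
    offsets = rng.normal(0.0, drift_level, size=X.shape[1]) * X.std(axis=0)
    return X + offsets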
Quantization
Simulates model compression by reducing weight precision.
from xai_eval.degradation.quantize import quantize_model
quantize_model(model, bits=8)
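A simple way to picture this is symmetric uniform quantization of the weight vector. The sketch below is illustrative, not the library's code:
import numpy as np

def quantize_model_sketch(model, bits=8):
    """Round the model's weights onto a symmetric uniform grid with
    2**(bits - 1) - 1 positive levels."""
    w = model.coef_
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    model.coef_ = np.round(w / scale) * scale
    return model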
Real-World Use Case
# A bank wants to audit its loan-approval model
from sklearn.ensemble import RandomForestClassifier
from xai_eval.pipeline.evaluator import evaluate
# Their existing model
model = RandomForestClassifier().fit(X_train, y_train)
# Run weekly automated audit
results = evaluate(
    model=model,
    X=X_test,
    explainer_1="shap_tree",
    explainer_2="lime_tree",
    degradation_levels=[0.01, 0.05, 0.1, 0.3, 0.5]
)
# Set automated alert
if results["baseline_rbo"] < 0.5:
    print("WARNING: Explainers are unreliable. Do not use for auditing.")
if results["breaking_point"] is not None:
    print(f"WARNING: Model breaks at degradation level {results['breaking_point']}")
Supported Datasets
| Type | Examples |
|---|---|
| Tabular | COMPAS, German Credit, loan approval, medical diagnosis |
| Classification | Binary, multiclass |
| Regression | With minor modification |
| sklearn datasets | load_breast_cancer(), load_iris(), load_wine() |
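The sklearn rows translate directly. For example, reusing the evaluate signature from Quick Start (the degradation levels here are arbitrary illustration values):
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from xai_eval.pipeline.evaluator import evaluate

# Binary tabular classification straight from sklearn
X, y = load_breast_cancer(return_X_y=True)
model = LogisticRegression(max_iter=5000).fit(X, y)
results = evaluate(
    model=model,
    X=X,
    explainer_1="shap",
    explainer_2="lime",
    degradation_levels=[0.01, 0.05, 0.1]
)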
Running Tests
pytest tests/ -v
18 passed in 5.43s
Project Structure
xai-eval-sandbox/
├── src/xai_eval/
│ ├── explainers.py # 10 XAI explainer wrappers
│ ├── metrics/
│ │ ├── rbo.py # Rank-Biased Overlap
│ │ └── jaccard.py # Jaccard@k similarity
│ ├── degradation/
│ │ ├── noise.py # Weight noise injection
│ │ ├── drift.py # Data drift simulation
│ │ └── quantize.py # Model quantization
│ └── pipeline/
│ └── evaluator.py # End-to-end audit pipeline
└── tests/
├── test_metrics.py
└── test_degradation.py
Checklist
- RBO metric
- Jaccard@k metric
- Weight noise degradation
- Data drift degradation
- Model quantization degradation
- 10 explainer wrappers
- End-to-end audit pipeline
- 18 passing tests
- PyPI publish
- GitHub Actions CI
Contributing
Contributions welcome! Here is what you can add:
- New explainer wrappers
- New degradation simulations
- New disagreement metrics
- Support for neural networks
- Visualization tools
git clone https://github.com/hemcharan710-afk/xai-eval-sandbox.git
pip install -e ".[dev]"
pytest tests/ -v
Reference
Krishna, S., Han, T., Gu, A., Wu, S., Jabbari, S., & Lakkaraju, H. (2024). The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective. Transactions on Machine Learning Research. https://openreview.net/forum?id=jESY2WTZCe
License
MIT License — free to use, modify, and distribute.
Author
Hemcharan — @hemcharan710-afk