Unified, extensible explainability framework supporting 18 XAI methods including LIME, SHAP, LRP, TCAV, GradCAM, and more
Explainiverse
Explainiverse is a unified, extensible Python framework for Explainable AI (XAI). It provides a standardized interface for 18 state-of-the-art explanation methods across local, global, gradient-based, concept-based, and example-based paradigms, along with comprehensive evaluation metrics for assessing explanation quality.
Key Features
| Feature | Description |
|---|---|
| 18 Explainers | LIME, KernelSHAP, TreeSHAP, Integrated Gradients, DeepLIFT, DeepSHAP, SmoothGrad, Saliency Maps, GradCAM/GradCAM++, LRP, TCAV, Anchors, Counterfactual, Permutation Importance, PDP, ALE, SAGE, ProtoDash |
| 13 Evaluation Metrics | Faithfulness (PGI, PGU, Comprehensiveness, Sufficiency, Correlation, Faithfulness Estimate, Monotonicity, Monotonicity-Nguyen, Pixel Flipping, Region Perturbation) and Stability (RIS, ROS, Lipschitz) |
| Unified API | Consistent BaseExplainer interface with standardized Explanation output |
| Plugin Registry | Filter explainers by scope, model type, data type; automatic recommendations |
| Framework Support | Adapters for scikit-learn and PyTorch (with gradient computation) |
Explainer Coverage
Local Explainers (Instance-Level)
| Method | Type | Reference |
|---|---|---|
| LIME | Perturbation | Ribeiro et al., 2016 |
| KernelSHAP | Perturbation | Lundberg & Lee, 2017 |
| TreeSHAP | Exact (Trees) | Lundberg et al., 2018 |
| Integrated Gradients | Gradient | Sundararajan et al., 2017 |
| DeepLIFT | Gradient | Shrikumar et al., 2017 |
| DeepSHAP | Gradient + Shapley | Lundberg & Lee, 2017 |
| SmoothGrad | Gradient | Smilkov et al., 2017 |
| Saliency Maps | Gradient | Simonyan et al., 2014 |
| GradCAM / GradCAM++ | Gradient (CNN) | Selvaraju et al., 2017 |
| LRP | Decomposition | Bach et al., 2015 |
| TCAV | Concept-Based | Kim et al., 2018 |
| Anchors | Rule-Based | Ribeiro et al., 2018 |
| Counterfactual | Contrastive | Mothilal et al., 2020 |
| ProtoDash | Example-Based | Gurumoorthy et al., 2019 |
Global Explainers (Model-Level)
| Method | Type | Reference |
|---|---|---|
| Permutation Importance | Feature Importance | Breiman, 2001 |
| Partial Dependence (PDP) | Feature Effect | Friedman, 2001 |
| ALE | Feature Effect | Apley & Zhu, 2020 |
| SAGE | Shapley Importance | Covert et al., 2020 |
Evaluation Metrics
Explainiverse includes a comprehensive suite of evaluation metrics based on the XAI literature:
Faithfulness Metrics
| Metric | Description | Reference |
|---|---|---|
| PGI | Prediction Gap on Important features | Petsiuk et al., 2018 |
| PGU | Prediction Gap on Unimportant features | Petsiuk et al., 2018 |
| Comprehensiveness | Drop when removing top-k features | DeYoung et al., 2020 |
| Sufficiency | Prediction using only top-k features | DeYoung et al., 2020 |
| Faithfulness Correlation | Correlation between attribution and impact | Bhatt et al., 2020 |
| Faithfulness Estimate | Correlation of attributions with single-feature perturbation impact | Alvarez-Melis & Jaakkola, 2018 |
| Monotonicity | Sequential feature addition shows monotonic prediction increase | Arya et al., 2019 |
| Monotonicity-Nguyen | Spearman correlation between attributions and feature removal impact | Nguyen & Martinez, 2020 |
| Pixel Flipping | AUC of prediction degradation when removing features by importance | Bach et al., 2015 |
| Region Perturbation | AUC of prediction degradation when perturbing feature regions by importance | Samek et al., 2015 |
Stability Metrics
| Metric | Description | Reference |
|---|---|---|
| RIS | Relative Input Stability | Agarwal et al., 2022 |
| ROS | Relative Output Stability | Agarwal et al., 2022 |
| Lipschitz Estimate | Local Lipschitz continuity | Alvarez-Melis & Jaakkola, 2018 |
Installation
# From PyPI
pip install explainiverse
# With PyTorch support (for gradient-based methods)
pip install explainiverse[torch]
# For development
git clone https://github.com/jemsbhai/explainiverse.git
cd explainiverse
poetry install
Quick Start
Basic Usage with Registry
from explainiverse import default_registry, SklearnAdapter
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
# Train a model
iris = load_iris()
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(iris.data, iris.target)
# Wrap with adapter
adapter = SklearnAdapter(model, class_names=iris.target_names.tolist())
# List all available explainers
print(default_registry.list_explainers())
# ['lime', 'shap', 'treeshap', 'integrated_gradients', 'deeplift', 'deepshap',
# 'smoothgrad', 'saliency', 'gradcam', 'lrp', 'tcav', 'anchors', 'counterfactual',
# 'protodash', 'permutation_importance', 'partial_dependence', 'ale', 'sage']
# Create an explainer via registry
explainer = default_registry.create(
    "lime",
    model=adapter,
    training_data=iris.data,
    feature_names=list(iris.feature_names),  # already a Python list in scikit-learn
    class_names=iris.target_names.tolist()
)
# Generate explanation
explanation = explainer.explain(iris.data[0])
print(explanation.explanation_data["feature_attributions"])
Filter and Recommend Explainers
# Filter by criteria
local_explainers = default_registry.filter(scope="local", data_type="tabular")
neural_explainers = default_registry.filter(model_type="neural")
image_explainers = default_registry.filter(data_type="image")
# Get recommendations
recommendations = default_registry.recommend(
model_type="neural",
data_type="tabular",
scope_preference="local",
max_results=5
)
Gradient-Based Explainers (PyTorch)
Integrated Gradients
from explainiverse import PyTorchAdapter
from explainiverse.explainers.gradient import IntegratedGradientsExplainer
import numpy as np
import torch.nn as nn
# Define and wrap model
model = nn.Sequential(
    nn.Linear(10, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 3)
)
adapter = PyTorchAdapter(model, task="classification", class_names=["A", "B", "C"])
# Placeholder input data (assumed for illustration): 5 samples, 10 features each
X = np.random.rand(5, 10).astype(np.float32)
# Create explainer
explainer = IntegratedGradientsExplainer(
    model=adapter,
    feature_names=[f"feature_{i}" for i in range(10)],
    class_names=["A", "B", "C"],
    n_steps=50,
    method="riemann_trapezoid"
)
# Explain with convergence check
explanation = explainer.explain(X[0], return_convergence_delta=True)
print(f"Attributions: {explanation.explanation_data['feature_attributions']}")
print(f"Convergence δ: {explanation.explanation_data['convergence_delta']:.6f}")
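The convergence delta reported above measures how closely the attributions sum to the difference between the model output at the input and at the baseline. The following is a minimal, dependency-free sketch of that idea, not Explainiverse's implementation: the toy function `f`, its analytic gradient `grad_f`, and the helper `integrated_gradients` are all hypothetical names introduced for illustration.

```python
import math

def f(x):
    """Toy differentiable model: f(x) = x0^2 + 3*x1 + sin(x2)."""
    return x[0] ** 2 + 3.0 * x[1] + math.sin(x[2])

def grad_f(x):
    """Analytic gradient of the toy model."""
    return [2.0 * x[0], 3.0, math.cos(x[2])]

def integrated_gradients(x, baseline, n_steps=50):
    """Trapezoidal Riemann approximation of IG along the straight-line path."""
    n = len(x)
    total = [0.0] * n
    for k in range(n_steps + 1):
        alpha = k / n_steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        # Trapezoid rule: the two end points get half weight
        weight = 0.5 if k in (0, n_steps) else 1.0
        g = grad_f(point)
        for i in range(n):
            total[i] += weight * g[i]
    # Scale the averaged gradient by the input-baseline difference
    return [(xi - b) * t / n_steps for xi, b, t in zip(x, baseline, total)]

x = [1.0, 2.0, 0.5]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline)
# Completeness check: attributions should sum to f(x) - f(baseline)
delta = abs(sum(attributions) - (f(x) - f(baseline)))
print(attributions, delta)
```

A small delta suggests that `n_steps` is large enough for the path integral to be well approximated; a large one is usually a cue to increase it.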
Layer-wise Relevance Propagation (LRP)
from explainiverse.explainers.gradient import LRPExplainer
# LRP - Decomposition-based attribution with conservation property
explainer = LRPExplainer(
model=adapter,
feature_names=feature_names,
class_names=class_names,
rule="epsilon", # Propagation rule: epsilon, gamma, alpha_beta, z_plus, composite
epsilon=1e-6 # Stabilization constant
)
# Basic explanation
explanation = explainer.explain(X[0], target_class=0)
print(explanation.explanation_data["feature_attributions"])
# Verify conservation property (sum of attributions ≈ target output)
explanation = explainer.explain(X[0], return_convergence_delta=True)
print(f"Conservation delta: {explanation.explanation_data['convergence_delta']:.6f}")
# Compare different LRP rules
comparison = explainer.compare_rules(X[0], rules=["epsilon", "gamma", "z_plus"])
for rule, result in comparison.items():
    print(f"{rule}: top feature = {result['top_feature']}")
# Layer-wise relevance analysis
layer_result = explainer.explain_with_layer_relevances(X[0])
for layer, relevances in layer_result["layer_relevances"].items():
    print(f"{layer}: sum = {sum(relevances):.4f}")
# Composite rules: different rules for different layers
explainer_composite = LRPExplainer(
model=adapter,
feature_names=feature_names,
class_names=class_names,
rule="composite"
)
explainer_composite.set_composite_rule({
0: "z_plus", # Input layer: focus on what's present
2: "epsilon", # Middle layers: balanced
4: "epsilon" # Output layer
})
explanation = explainer_composite.explain(X[0])
LRP Propagation Rules:
| Rule | Description | Use Case |
|---|---|---|
| epsilon | Adds stabilization constant | General purpose (default) |
| gamma | Enhances positive contributions | Image classification |
| alpha_beta | Separates pos/neg (α-β=1) | Fine-grained control |
| z_plus | Only positive weights | Input layers, what's present |
| composite | Different rules per layer | Best practice for deep nets |
Supported Layers:
- Linear, Conv2d
- BatchNorm1d, BatchNorm2d
- ReLU, LeakyReLU, ELU, Tanh, Sigmoid, GELU
- MaxPool2d, AvgPool2d, AdaptiveAvgPool2d
- Flatten, Dropout
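To make the epsilon rule and the conservation property concrete, here is a small sketch for a single bias-free linear layer. It is an illustration under stated assumptions rather than the library's code: `lrp_epsilon_linear` is a hypothetical helper, and the weights and inputs are made-up values.

```python
def lrp_epsilon_linear(x, weights, relevance_out, epsilon=1e-6):
    """Redistribute output relevance to the inputs of a bias-free linear layer.

    weights[i][j] connects input i to output j.
    """
    n_in, n_out = len(x), len(relevance_out)
    # Forward pass: pre-activations z_j = sum_i x_i * w_ij
    z = [sum(x[i] * weights[i][j] for i in range(n_in)) for j in range(n_out)]
    relevance_in = [0.0] * n_in
    for j in range(n_out):
        # The stabilizer moves the denominator away from zero, keeping its sign
        denom = z[j] + (epsilon if z[j] >= 0 else -epsilon)
        for i in range(n_in):
            relevance_in[i] += x[i] * weights[i][j] / denom * relevance_out[j]
    return relevance_in

x = [1.0, -2.0, 0.5]
weights = [[0.4, -0.1], [0.3, 0.8], [-0.6, 0.2]]
z = [sum(x[i] * weights[i][j] for i in range(3)) for j in range(2)]
# Explain output unit 0: its pre-activation is the starting relevance
relevance_out = [z[0], 0.0]
relevance_in = lrp_epsilon_linear(x, weights, relevance_out)
print(relevance_in, sum(relevance_in), z[0])
```

With a small epsilon the input relevances sum to approximately the explained output, which is the quantity a conservation delta would check. Larger epsilon values absorb more relevance and weaken that equality.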
DeepLIFT and DeepSHAP
from explainiverse.explainers.gradient import DeepLIFTExplainer, DeepLIFTShapExplainer
# DeepLIFT - Fast reference-based attributions
deeplift = DeepLIFTExplainer(
model=adapter,
feature_names=feature_names,
class_names=class_names,
baseline=None # Uses zero baseline by default
)
explanation = deeplift.explain(X[0])
# DeepSHAP - DeepLIFT averaged over background samples
deepshap = DeepLIFTShapExplainer(
model=adapter,
feature_names=feature_names,
class_names=class_names,
background_data=X_train[:100]
)
explanation = deepshap.explain(X[0])
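The relationship between the two methods can be sketched on a linear model, where DeepLIFT's attribution for a feature reduces to the weight times the difference from the baseline. The helpers below (`predict`, `deeplift_linear`, `deepshap_linear`) are hypothetical and only illustrate the averaging over background samples; they are not the Explainiverse classes.

```python
def predict(x, weights, bias):
    """Toy linear model."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def deeplift_linear(x, baseline, weights):
    """For a linear model, DeepLIFT assigns w_i * (x_i - baseline_i)."""
    return [w * (xi - bi) for w, xi, bi in zip(weights, x, baseline)]

def deepshap_linear(x, background, weights):
    """Average the DeepLIFT attributions over every background sample."""
    per_baseline = [deeplift_linear(x, b, weights) for b in background]
    return [sum(col) / len(col) for col in zip(*per_baseline)]

weights, bias = [1.5, -2.0, 0.5], 0.25
x = [1.0, 0.5, -1.0]
background = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.5, -0.5, 2.0]]

attributions = deepshap_linear(x, background, weights)
mean_background_output = sum(predict(b, weights, bias) for b in background) / len(background)
expected_gap = predict(x, weights, bias) - mean_background_output
print(attributions, sum(attributions), expected_gap)
```

In this linear setting the averaged attributions sum to the gap between the prediction and the mean background prediction, which is why the choice of background data shifts every attribution.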
Saliency Maps
from explainiverse.explainers.gradient import SaliencyExplainer
# Saliency Maps - simplest and fastest gradient method
explainer = SaliencyExplainer(
model=adapter,
feature_names=feature_names,
class_names=class_names,
absolute_value=True # Default: absolute gradient magnitudes
)
# Standard saliency (absolute gradients)
explanation = explainer.explain(X[0], method="saliency")
# Input × Gradient (gradient scaled by input values)
explanation = explainer.explain(X[0], method="input_times_gradient")
# Signed saliency (keep gradient direction)
explainer_signed = SaliencyExplainer(
model=adapter,
feature_names=feature_names,
class_names=class_names,
absolute_value=False
)
explanation = explainer_signed.explain(X[0])
# Compare all variants
variants = explainer.compute_all_variants(X[0])
print(variants["saliency_absolute"])
print(variants["saliency_signed"])
print(variants["input_times_gradient"])
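The three variants differ only in how the raw gradient is post-processed. A minimal sketch, assuming a toy scalar function and a central finite-difference gradient in place of autograd (`f`, `numerical_gradient`, and `saliency_variants` are hypothetical names):

```python
def f(x):
    """Toy model: f(x) = x0^2 - 3*x1."""
    return x[0] ** 2 - 3.0 * x[1]

def numerical_gradient(fn, x, h=1e-5):
    """Central finite differences, standing in for backpropagation."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        grad.append((fn(up) - fn(down)) / (2.0 * h))
    return grad

def saliency_variants(fn, x):
    g = numerical_gradient(fn, x)
    return {
        "saliency_signed": g,                                    # raw gradient
        "saliency_absolute": [abs(v) for v in g],                # magnitude only
        "input_times_gradient": [xi * v for xi, v in zip(x, g)], # scaled by input
    }

variants = saliency_variants(f, [2.0, 1.0])
print(variants)
```

The signed variant keeps the direction of influence, the absolute variant ranks features by sensitivity alone, and input-times-gradient additionally weights each feature by its value.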
SmoothGrad
from explainiverse.explainers.gradient import SmoothGradExplainer
# SmoothGrad - Noise-averaged gradients for smoother saliency
explainer = SmoothGradExplainer(
model=adapter,
feature_names=feature_names,
class_names=class_names,
n_samples=50,
noise_scale=0.15,
noise_type="gaussian" # or "uniform"
)
# Standard SmoothGrad
explanation = explainer.explain(X[0], method="smoothgrad")
# SmoothGrad-Squared (sharper attributions)
explanation = explainer.explain(X[0], method="smoothgrad_squared")
# VarGrad (variance of gradients)
explanation = explainer.explain(X[0], method="vargrad")
# With absolute values
explanation = explainer.explain(X[0], absolute_value=True)
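The three methods above aggregate the same set of noisy gradients in different ways. The sketch below shows the aggregation on a toy function with an analytic gradient. It treats `noise_scale` directly as the Gaussian standard deviation, which is an assumption and may differ from how the library scales noise; all helper names are hypothetical.

```python
import random
import statistics

def grad(x):
    """Analytic gradient of the toy model f(x) = x0^3 + 2*x0*x1."""
    return [3.0 * x[0] ** 2 + 2.0 * x[1], 2.0 * x[0]]

def noisy_gradients(x, n_samples=50, noise_scale=0.15, seed=0):
    """Evaluate the gradient at Gaussian-perturbed copies of x."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, noise_scale) for xi in x]
        samples.append(grad(noisy))
    return samples

def smoothgrad_variants(x, **kwargs):
    samples = noisy_gradients(x, **kwargs)
    per_feature = list(zip(*samples))  # one tuple of samples per feature
    return {
        "smoothgrad": [statistics.fmean(g) for g in per_feature],
        "smoothgrad_squared": [statistics.fmean(v * v for v in g) for g in per_feature],
        "vargrad": [statistics.pvariance(g) for g in per_feature],
    }

result = smoothgrad_variants([1.0, 2.0])
print(result)
```

With the population variance used here, the mean of squared gradients equals the variance plus the squared mean, so SmoothGrad-Squared carries both the SmoothGrad and the VarGrad signal.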
GradCAM for CNNs
from explainiverse.explainers.gradient import GradCAMExplainer
# For CNN models
adapter = PyTorchAdapter(cnn_model, task="classification", class_names=class_names)
explainer = GradCAMExplainer(
model=adapter,
target_layer="layer4", # Last conv layer
class_names=class_names,
method="gradcam++" # or "gradcam"
)
explanation = explainer.explain(image)
heatmap = explanation.explanation_data["heatmap"]
overlay = explainer.get_overlay(original_image, heatmap, alpha=0.5)
TCAV (Concept-Based Explanations)
from explainiverse.explainers.gradient import TCAVExplainer
# For neural network models with concept examples
adapter = PyTorchAdapter(model, task="classification", class_names=class_names)
# Create TCAV explainer targeting a specific layer
explainer = TCAVExplainer(
model=adapter,
layer_name="layer3", # Target layer for concept analysis
class_names=class_names
)
# Learn a concept from examples (e.g., "striped" pattern)
explainer.learn_concept(
concept_name="striped",
concept_examples=striped_images, # Images with stripes
negative_examples=random_images, # Random images without stripes
min_accuracy=0.6 # Minimum CAV classifier accuracy
)
# Compute TCAV score: fraction of inputs where concept positively influences prediction
tcav_score = explainer.compute_tcav_score(
test_inputs=test_images,
target_class=0, # e.g., "zebra"
concept_name="striped"
)
print(f"TCAV score: {tcav_score:.3f}") # >0.5 means concept positively influences class
# Statistical significance testing against random concepts
result = explainer.statistical_significance_test(
test_inputs=test_images,
target_class=0,
concept_name="striped",
n_random=10,
negative_examples=random_images
)
print(f"p-value: {result['p_value']:.4f}, significant: {result['significant']}")
# Full explanation with multiple concepts
explanation = explainer.explain(
test_inputs=test_images,
target_class=0,
run_significance_test=True
)
print(explanation.explanation_data["tcav_scores"])
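The TCAV score described above is the fraction of inputs whose class output increases when the layer activations move along the concept activation vector. A minimal sketch of that final step, assuming the per-input gradients with respect to the layer activations and the CAV have already been computed (the numbers below are made up, and `tcav_score` is a hypothetical helper):

```python
def tcav_score(activation_gradients, cav):
    """Fraction of inputs whose directional derivative along the CAV is positive."""
    positive = 0
    for g in activation_gradients:
        directional_derivative = sum(gi * ci for gi, ci in zip(g, cav))
        if directional_derivative > 0:
            positive += 1
    return positive / len(activation_gradients)

# Hypothetical CAV and per-input gradients of the class logit at the chosen layer
cav = [0.6, 0.8]
activation_gradients = [
    [1.0, 0.5],
    [-0.2, 0.4],
    [-1.0, 0.1],
    [0.3, -0.1],
]
score = tcav_score(activation_gradients, cav)
print(score)
```

Three of the four inputs have a positive directional derivative here, so the score is above 0.5. A significance test then asks whether such a score would also arise from CAVs trained on random examples.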
Example-Based Explanations
ProtoDash
from explainiverse.explainers.example_based import ProtoDashExplainer
explainer = ProtoDashExplainer(
model=adapter,
training_data=X_train,
feature_names=feature_names,
n_prototypes=5,
kernel="rbf",
gamma=0.1
)
explanation = explainer.explain(X_test[0])
print(explanation.explanation_data["prototype_indices"])
print(explanation.explanation_data["prototype_weights"])
Evaluation Metrics
Faithfulness Evaluation
from explainiverse.evaluation import (
compute_pgi, compute_pgu,
compute_comprehensiveness, compute_sufficiency,
compute_faithfulness_correlation
)
# PGI - Higher is better (important features affect predictions)
pgi = compute_pgi(
model=adapter,
instance=X[0],
attributions=attributions,
feature_names=feature_names,
top_k=3
)
# PGU - Lower is better (unimportant features don't affect predictions)
pgu = compute_pgu(
model=adapter,
instance=X[0],
attributions=attributions,
feature_names=feature_names,
top_k=3
)
# Comprehensiveness - Higher is better
comp = compute_comprehensiveness(
model=adapter,
instance=X[0],
attributions=attributions,
feature_names=feature_names,
top_k_values=[1, 2, 3, 5]
)
# Sufficiency - Lower is better
suff = compute_sufficiency(
model=adapter,
instance=X[0],
attributions=attributions,
feature_names=feature_names,
top_k_values=[1, 2, 3, 5]
)
# Faithfulness Correlation
corr = compute_faithfulness_correlation(
model=adapter,
instance=X[0],
attributions=attributions,
feature_names=feature_names
)
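To show what comprehensiveness and sufficiency measure, the sketch below computes both for a toy logistic model with a single `k`. It assumes removed features are replaced by a zero baseline, which is one common convention and not necessarily what `compute_comprehensiveness` and `compute_sufficiency` do; every name in the block is hypothetical.

```python
import math

WEIGHTS = [2.0, -1.0, 0.5, 0.0]

def predict(x):
    """Toy logistic model returning the positive-class probability."""
    return 1.0 / (1.0 + math.exp(-sum(w * xi for w, xi in zip(WEIGHTS, x))))

def top_k_indices(attributions, k):
    """Indices of the k features with the largest absolute attribution."""
    order = sorted(range(len(attributions)), key=lambda i: abs(attributions[i]), reverse=True)
    return set(order[:k])

def comprehensiveness(x, attributions, k, baseline=0.0):
    """Prediction drop after replacing the top-k features with the baseline."""
    top = top_k_indices(attributions, k)
    masked = [baseline if i in top else xi for i, xi in enumerate(x)]
    return predict(x) - predict(masked)

def sufficiency(x, attributions, k, baseline=0.0):
    """Prediction drop when only the top-k features are kept."""
    top = top_k_indices(attributions, k)
    kept = [xi if i in top else baseline for i, xi in enumerate(x)]
    return predict(x) - predict(kept)

x = [1.0, 1.0, 1.0, 1.0]
attributions = [w * xi for w, xi in zip(WEIGHTS, x)]  # weight * input as a stand-in
comp = comprehensiveness(x, attributions, k=1)
suff = sufficiency(x, attributions, k=1)
print(comp, suff)
```

A faithful explanation tends to give a large comprehensiveness (removing the top features hurts) and a small sufficiency (the top features alone nearly reproduce the prediction).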
Stability Evaluation
from explainiverse.evaluation import (
compute_ris, compute_ros, compute_lipschitz_estimate
)
# RIS - Relative Input Stability (lower is better)
ris = compute_ris(
explainer=explainer,
instance=X[0],
n_perturbations=10,
perturbation_scale=0.1
)
# ROS - Relative Output Stability (lower is better)
ros = compute_ros(
model=adapter,
explainer=explainer,
instance=X[0],
n_perturbations=10,
perturbation_scale=0.1
)
# Lipschitz Estimate (lower is better)
lipschitz = compute_lipschitz_estimate(
explainer=explainer,
instance=X[0],
n_perturbations=20,
perturbation_scale=0.1
)
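A local Lipschitz estimate can be sketched as the largest ratio between the change in the explanation and the change in the input over random nearby points. The block below assumes uniform perturbations and the L2 norm, and uses a stand-in `explain` function; none of these choices are confirmed to match `compute_lipschitz_estimate`.

```python
import math
import random

def explain(x):
    """Stand-in explainer: input-times-gradient for f(x) = x0^2 + x1."""
    return [2.0 * x[0] * x[0], x[1]]

def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def lipschitz_estimate(explain_fn, x, n_perturbations=20, perturbation_scale=0.1, seed=0):
    """Max ratio ||e(x) - e(x')|| / ||x - x'|| over sampled neighbours x'."""
    rng = random.Random(seed)
    base = explain_fn(x)
    worst = 0.0
    for _ in range(n_perturbations):
        neighbour = [xi + rng.uniform(-perturbation_scale, perturbation_scale) for xi in x]
        dist = l2(x, neighbour)
        if dist == 0.0:
            continue  # skip degenerate perturbations
        worst = max(worst, l2(base, explain_fn(neighbour)) / dist)
    return worst

estimate = lipschitz_estimate(explain, [1.0, 1.0])
constant = lipschitz_estimate(lambda x: [1.0, 1.0], [1.0, 1.0])
print(estimate, constant)
```

An explainer that ignores its input scores zero, while one that reacts sharply to small input changes scores high, which is why lower values indicate more stable explanations.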
Global Explainers
from explainiverse.explainers import (
PermutationImportanceExplainer,
PartialDependenceExplainer,
ALEExplainer,
SAGEExplainer
)
# Permutation Importance
perm_imp = PermutationImportanceExplainer(
model=adapter,
X=X_test,
y=y_test,
feature_names=feature_names,
n_repeats=10
)
explanation = perm_imp.explain()
# Partial Dependence Plot
pdp = PartialDependenceExplainer(
model=adapter,
X=X_train,
feature_names=feature_names
)
explanation = pdp.explain(feature="feature_0", grid_resolution=50)
# ALE (handles correlated features)
ale = ALEExplainer(
model=adapter,
X=X_train,
feature_names=feature_names
)
explanation = ale.explain(feature="feature_0", n_bins=20)
# SAGE (global Shapley importance)
sage = SAGEExplainer(
model=adapter,
X=X_train,
y=y_train,
feature_names=feature_names,
n_permutations=512
)
explanation = sage.explain()
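Of the global methods above, permutation importance is the simplest to sketch: shuffle one feature column at a time and record how much the error grows. The following pure-Python illustration uses a toy regressor and mean squared error, both assumptions made for the example; it is not the `PermutationImportanceExplainer` implementation.

```python
import random

def model(row):
    """Toy regressor that depends on feature 0 only."""
    return 3.0 * row[0]

def mse(X, y):
    """Mean squared error of the toy model on (X, y)."""
    return sum((model(r) - t) ** 2 for r, t in zip(X, y)) / len(X)

def permutation_importance(X, y, n_repeats=10, seed=0):
    """Mean increase in MSE after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = mse(X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by its shuffled copy
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, column)]
            increases.append(mse(shuffled, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [model(r) for r in X]
importances = permutation_importance(X, y)
print(importances)
```

Feature 1 is ignored by the model, so shuffling it leaves the error unchanged, whereas shuffling feature 0 breaks its link to the target and raises the error.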
Multi-Explainer Comparison
from explainiverse import ExplanationSuite
suite = ExplanationSuite(
model=adapter,
explainer_configs=[
("lime", {"training_data": X_train, "feature_names": feature_names, "class_names": class_names}),
("shap", {"background_data": X_train[:50], "feature_names": feature_names, "class_names": class_names}),
("treeshap", {"feature_names": feature_names, "class_names": class_names}),
]
)
results = suite.run(X_test[0])
suite.compare()
Custom Explainer Registration
from explainiverse import default_registry, ExplainerMeta, BaseExplainer, Explanation
@default_registry.register_decorator(
    name="my_explainer",
    meta=ExplainerMeta(
        scope="local",
        model_types=["any"],
        data_types=["tabular"],
        task_types=["classification", "regression"],
        description="My custom explainer",
        paper_reference="Author et al., 2024",
        complexity="O(n)",
        requires_training_data=False,
        supports_batching=True
    )
)
class MyExplainer(BaseExplainer):
    def __init__(self, model, feature_names, **kwargs):
        super().__init__(model)
        self.feature_names = feature_names

    def explain(self, instance, **kwargs):
        # Your implementation
        attributions = self._compute_attributions(instance)
        return Explanation(
            explainer_name="MyExplainer",
            target_class="output",
            explanation_data={"feature_attributions": attributions}
        )
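For readers curious how a decorator-based registry with metadata filtering can work, here is a minimal standalone sketch of the pattern. It is not Explainiverse's `ExplainerRegistry`: the `Meta` and `Registry` classes and the toy explainers are hypothetical, and only the method names `register_decorator`, `filter`, and `create` mirror the examples in this README.

```python
from dataclasses import dataclass, field

@dataclass
class Meta:
    """Simplified stand-in for explainer metadata."""
    scope: str
    data_types: list = field(default_factory=list)

class Registry:
    def __init__(self):
        self._entries = {}

    def register_decorator(self, name, meta):
        """Return a class decorator that records the class under `name`."""
        def wrap(cls):
            if name in self._entries:
                raise ValueError(f"{name!r} is already registered")
            self._entries[name] = (cls, meta)
            return cls
        return wrap

    def filter(self, scope=None, data_type=None):
        """Names of registered classes whose metadata matches every given criterion."""
        return [
            name for name, (_, meta) in self._entries.items()
            if (scope is None or meta.scope == scope)
            and (data_type is None or data_type in meta.data_types)
        ]

    def create(self, name, **kwargs):
        """Instantiate a registered class by name."""
        cls, _ = self._entries[name]
        return cls(**kwargs)

registry = Registry()

@registry.register_decorator("toy_local", Meta(scope="local", data_types=["tabular"]))
class ToyLocal:
    def __init__(self, model=None):
        self.model = model

@registry.register_decorator("toy_global", Meta(scope="global", data_types=["tabular", "image"]))
class ToyGlobal:
    def __init__(self, model=None):
        self.model = model

print(registry.filter(scope="local"))
print(registry.filter(data_type="image"))
```

Keeping the metadata next to the class at registration time is what lets filtering and recommendation work without instantiating any explainer.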
Architecture
explainiverse/
├── core/
│ ├── explainer.py # BaseExplainer abstract class
│ ├── explanation.py # Unified Explanation container
│ └── registry.py # ExplainerRegistry with metadata
├── adapters/
│ ├── sklearn_adapter.py
│ └── pytorch_adapter.py # With gradient support
├── explainers/
│ ├── attribution/ # LIME, SHAP, TreeSHAP
│ ├── gradient/ # IG, DeepLIFT, DeepSHAP, SmoothGrad, Saliency, GradCAM, LRP, TCAV
│ ├── rule_based/ # Anchors
│ ├── counterfactual/ # DiCE-style
│ ├── global_explainers/ # Permutation, PDP, ALE, SAGE
│ └── example_based/ # ProtoDash
├── evaluation/
│ ├── faithfulness.py # PGI, PGU, Comprehensiveness, Sufficiency
│ └── stability.py # RIS, ROS, Lipschitz
└── engine/
└── suite.py # Multi-explainer comparison
Running Tests
# Run all tests
poetry run pytest
# Run with coverage
poetry run pytest --cov=explainiverse --cov-report=html
# Run specific test file
poetry run pytest tests/test_lrp.py -v
# Run specific test class
poetry run pytest tests/test_lrp.py::TestLRPConv2d -v
Roadmap
Completed ✅
- Core framework (BaseExplainer, Explanation, Registry)
- Perturbation methods: LIME, KernelSHAP, TreeSHAP
- Gradient methods: Integrated Gradients, DeepLIFT, DeepSHAP, SmoothGrad, Saliency Maps, GradCAM/GradCAM++
- Decomposition methods: Layer-wise Relevance Propagation (LRP) with ε, γ, αβ, z⁺, composite rules
- Concept-based: TCAV (Testing with Concept Activation Vectors)
- Rule-based: Anchors
- Counterfactual: DiCE-style
- Global: Permutation Importance, PDP, ALE, SAGE
- Example-based: ProtoDash
- Evaluation: Faithfulness metrics (PGI, PGU, Comprehensiveness, Sufficiency, Correlation)
- Evaluation: Stability metrics (RIS, ROS, Lipschitz)
- PyTorch adapter with gradient support
In Progress 🔄
- Evaluation metrics expansion: adding 42 more metrics across 7 categories to exceed Quantus (37 metrics)
  - Phase 1: Faithfulness (+9 metrics), 4/12 complete
  - Phase 2: Robustness (+7 metrics)
  - Phase 3: Localisation (+8 metrics)
  - Phase 4: Complexity (+4 metrics)
  - Phase 5: Randomisation (+5 metrics)
  - Phase 6: Axiomatic (+4 metrics)
  - Phase 7: Fairness (+4 metrics)
Planned 📋
- Attention-based explanations (for Transformers)
- TensorFlow/Keras adapter
- Interactive visualization dashboard
- Explanation caching and serialization
- Distributed computation support
Citation
If you use Explainiverse in your research, please cite:
@software{explainiverse2025,
title = {Explainiverse: A Unified Framework for Explainable AI},
author = {Syed, Muntaser},
year = {2025},
url = {https://github.com/jemsbhai/explainiverse},
version = {0.8.4}
}
Contributing
Contributions are welcome! Please see our Contributing Guide for details.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Write tests for your changes
- Ensure all tests pass (poetry run pytest)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
MIT License - see LICENSE for details.
Acknowledgments
Explainiverse builds upon the foundational work of many researchers in the XAI community. We thank the authors of LIME, SHAP, Integrated Gradients, DeepLIFT, LRP, GradCAM, TCAV, Anchors, DiCE, ALE, SAGE, and ProtoDash for their contributions to interpretable machine learning.