shapiq: Shapley Interactions for Machine Learning
An interaction may speak more than a thousand main effects.
Shapley Interaction Quantification (shapiq) is a Python package for (1) approximating any-order Shapley interactions, (2) benchmarking game-theoretical algorithms for machine learning, and (3) explaining feature interactions of model predictions. shapiq extends the well-known shap package, serving both researchers working on game theory in machine learning and end-users explaining their models. SHAP-IQ extends individual Shapley values by quantifying the synergy effects between entities (known as players in game-theory jargon) such as explanatory features, data points, or weak learners in ensemble models. These synergies between players give a more comprehensive view of machine learning models.
🛠️ Install
shapiq is intended to work with Python 3.9 and above. Installation can be done via pip:
pip install shapiq
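A quick way to verify the installation; this assumes the package exposes a __version__ attribute, as most packages do:

import shapiq

# should print the installed version, e.g. 1.1.1 (assumes shapiq exposes __version__)
print(shapiq.__version__)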
⭐ Quickstart
You can explain your model with shapiq.explainer and visualize Shapley interactions with shapiq.plot. If you are interested in the underlying game-theoretic algorithms, check out the shapiq.approximator and shapiq.games modules.
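The module names above are importable directly; the comments summarize what each one contains according to this page:

import shapiq.explainer     # explanation interfaces such as TabularExplainer and TreeExplainer
import shapiq.plot          # visualizations such as the network plot
import shapiq.approximator  # approximation algorithms for Shapley values and interactions
import shapiq.games         # game abstractions and benchmark games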
📈 Compute any-order feature interactions
Explain your models with Shapley interaction values like the k-SII values:
import shapiq
# load data
X, y = shapiq.load_california_housing(to_numpy=True)
# train a model
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor()
model.fit(X, y)
# set up an explainer with k-SII interaction values up to order 4
explainer = shapiq.TabularExplainer(
model=model,
data=X,
index="k-SII",
max_order=4
)
# explain the model's prediction for the first sample
interaction_values = explainer.explain(X[0], budget=256)
# analyse interaction values
print(interaction_values)
>> InteractionValues(
>> index=k-SII, max_order=4, min_order=0, estimated=False,
>> estimation_budget=256, n_players=8, baseline_value=2.07282292,
>> Top 10 interactions:
>> (0,): 1.696969079 # attribution of feature 0
>> (0, 5): 0.4847876
>> (0, 1): 0.4494288 # interaction between features 0 & 1
>> (0, 6): 0.4477677
>> (1, 5): 0.3750034
>> (4, 5): 0.3468325
>> (0, 3, 6): -0.320 # interaction between features 0 & 3 & 6
>> (2, 3, 6): -0.329
>> (0, 1, 5): -0.363
>> (6,): -0.56358890
>> )
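Since the model above is a random forest, the TreeExplainer (see the module overview in the v1.0.0 notes below) can be used instead of the model-agnostic TabularExplainer. A minimal sketch, assuming TreeExplainer mirrors the parameters shown above; consult the API reference for the exact signature:

# hedged sketch: parameter names mirror the TabularExplainer example above
explainer = shapiq.TreeExplainer(model=model, index="k-SII", max_order=2)
interaction_values = explainer.explain(X[0])  # no sampling budget is passed here
print(interaction_values)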
📊 Visualize feature interactions
A handy way of visualizing interaction scores up to order 2 is the network plot. You can see an example of such a plot below. The nodes represent feature attributions, and the edges represent the interactions between features. The strength and size of the nodes and edges are proportional to the absolute values of the attributions and interactions, respectively.
shapiq.network_plot(
first_order_values=interaction_values.get_n_order_values(1),
second_order_values=interaction_values.get_n_order_values(2)
)
# or use
interaction_values.plot_network()
The code above produces a network plot of the feature attributions and interactions (an example image is shown on the project page).
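Beyond plotting, the per-order values can be inspected directly. A small sketch, assuming get_n_order_values(2) returns a square matrix of pairwise scores, as its use in network_plot above suggests:

import numpy as np

order_2 = interaction_values.get_n_order_values(2)  # pairwise interaction scores
i, j = np.unravel_index(np.abs(order_2).argmax(), order_2.shape)
print(f"strongest pairwise interaction: features {i} and {j} -> {order_2[i, j]:.4f}")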
📖 Documentation with tutorials
The documentation of shapiq can be found at https://shapiq.readthedocs.io
💬 Citation
If you enjoy using the shapiq package, please consider citing our NeurIPS paper:
@inproceedings{muschalik2024shapiq,
title = {shapiq: Shapley Interactions for Machine Learning},
author = {Maximilian Muschalik and Hubert Baniecki and Fabian Fumagalli and
Patrick Kolpaczki and Barbara Hammer and Eyke H\"{u}llermeier},
booktitle = {The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year = {2024},
url = {https://openreview.net/forum?id=knxGmi6SJi}
}
Changelog
v1.1.1 (2024-11-13)
Improvements and Ease of Use
- adds a class_index parameter to TabularExplainer and Explainer to specify the class index to be explained for classification models #271 (renames the class_label parameter in TreeExplainer to class_index); a short usage sketch follows this list
- adds support for PyTorch models to Explainer #272
- adds new tests comparing shapiq outputs for SVs with values computed with shap
- adds new tests for checking shapiq explainers with different types of models
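A minimal usage sketch of the new class_index parameter; the Iris data and classifier below are placeholders, and the remaining arguments follow the quickstart example on this page:

import shapiq
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)                       # placeholder dataset
clf = RandomForestClassifier(random_state=0).fit(X, y)  # placeholder classifier

# explain the prediction for class 1 of the first sample
explainer = shapiq.TabularExplainer(
    model=clf, data=X, index="k-SII", max_order=2, class_index=1
)
interaction_values = explainer.explain(X[0], budget=128)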
Bug Fixes
- fixes a bug where RandomForestClassifier models were not working with the TreeExplainer #273
v1.1.0 (2024-11-07)
New Features and Improvements
- adds computation of the Egalitarian Core (EC) and Egalitarian Least-Core (ELC) to the ExactComputer #182
- adds waterfall_plot #34, which visualizes the contributions of features to the model prediction
- adds BaselineImputer #107, which is now responsible for handling the sample_replacements parameter; a DeprecationWarning was added for the parameter in MarginalImputer, which will be removed in the next release
- adds a joint_marginal_distribution parameter to MarginalImputer with default value True #261
- renames the explanation graph to si_graph
- get_n_order now has optional lower/upper limits for the order
- computing metrics for benchmarking now tries to resolve non-matching interaction indices and throws a warning instead of a ValueError #179
- adds a legend to benchmark plots #170
- refactors the shapiq.games.benchmark module into a separate shapiq.benchmark module by moving all but the benchmark games into the new module; this closes #169 and makes benchmarking more flexible and convenient
- a shapiq.Game can now be called more intuitively with coalition data types (tuples of int or str) and also allows adding player_names to the game at initialization #183; see the sketch after this list
- improves tests across the package
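A toy sketch of the new calling convention: subclass shapiq.Game, implement value_function on a boolean coalition matrix (the pattern shown in the custom-games notebook referenced below), and call the game with player names or indices. Constructor argument names other than player_names are assumptions here; check the docs:

import numpy as np
import shapiq

class GloveGame(shapiq.Game):
    # toy 3-player game: a coalition has value 1 if it pairs a left-glove player (0 or 1)
    # with the right-glove player (2)
    def __init__(self):
        super().__init__(n_players=3, player_names=["alice", "bob", "carol"], normalize=False)

    def value_function(self, coalitions: np.ndarray) -> np.ndarray:
        left = coalitions[:, :2].any(axis=1)
        right = coalitions[:, 2]
        return (left & right).astype(float)

game = GloveGame()
print(game(("alice", "carol")))  # coalition given by player names
print(game((0, 2)))              # or by player indices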
Documentation
- adds a notebook showing how to use custom tree models with the TreeExplainer #66
- adds a notebook showing how to use the shapiq.Game API to create custom games #184
- adds a notebook showing how to visualize interactions #252
- adds a notebook showing how to compute Shapley values with shapiq #193
- adds a notebook for conducting data valuation #190
- adds a notebook introducing the Core and how to compute it with shapiq #191
Bug Fixes
- fixes a bug with SIs not adding up to the model prediction because of wrong values in the empty set #264
- fixes a bug where TreeExplainer did not have the correct baseline_value when using XGBoost models #250
- fixes the force plot not showing and its baseline value
v1.0.1 (2024-06-05)
- adds max_order=1 to TabularExplainer and TreeExplainer
- fixes TreeExplainer.explain_X(..., n_jobs=2, random_state=0)
v1.0.0 (2024-06-04)
Major release of the shapiq Python package including (among others):
- the approximator module implements over 10 approximators of Shapley values and interaction indices.
- the exact module implements a computer for over 10 game-theoretic concepts like interaction indices or generalized values.
- the games module implements over 10 application benchmarks for the approximators.
- the explainer module includes a TabularExplainer and a TreeExplainer for any-order feature interactions of machine learning model predictions.
- the interaction_values module implements a data class to store and analyze interaction values.
- the plot module allows visualizing interaction values.
- the datasets module loads datasets for testing and examples.
Documentation of shapiq with tutorials and API reference is available at https://shapiq.readthedocs.io
File details
Details for the file shapiq-1.1.1.tar.gz.
File metadata
- Download URL: shapiq-1.1.1.tar.gz
- Upload date:
- Size: 167.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | c387ac43578f615fcdce78a483c9878687948935969b8b4467c95c454435cfe5
MD5 | 1f257c4cabae8c8a9e2261b739e5b23d
BLAKE2b-256 | 49f62ff07243e1e49385fb1ba9b7cd31bb44f8d8c1dddd01f1115c98b8f98faf
File details
Details for the file shapiq-1.1.1-py3-none-any.whl.
File metadata
- Download URL: shapiq-1.1.1-py3-none-any.whl
- Upload date:
- Size: 213.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6db684a5b0d5739920856c8228a98ff05654418b3df89f5944c87be7b6a1e575
MD5 | 7f978d7f2d28f01ce8b07f66d1f6e866
BLAKE2b-256 | f20f174147dd61241e478291006e66e96a512da32161659591fe4f52f3b4bc0a