Shapley Interactions for Machine Learning

shapiq: Shapley Interactions for Machine Learning

An interaction may speak more than a thousand main effects.

Shapley Interaction Quantification (shapiq) is a Python package for (1) approximating any-order Shapley interactions, (2) benchmarking game-theoretical algorithms for machine learning, and (3) explaining feature interactions of model predictions. shapiq extends the well-known shap package and serves both researchers working on game theory in machine learning and end users explaining models. SHAP-IQ extends individual Shapley values by quantifying the synergy effect between entities (called players in game theory) such as explanatory features, data points, or weak learners in ensemble models. Synergies between players give a more comprehensive view of machine learning models.
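
To make the idea of synergy between players concrete, the following minimal sketch computes exact Shapley values and pairwise Shapley interactions by brute force for a three-player toy game. It does not use shapiq: the game `value`, the helper names, and the cooperation bonus of 4 are assumptions made up for illustration. The formulas are the standard Shapley value and the Shapley interaction index for pairs.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

PLAYERS = (0, 1, 2)  # hypothetical toy game with three players


def value(coalition):
    """Toy game: additive worths plus a bonus of 4 when players 0 and 1 cooperate."""
    worth = {0: 1, 1: 2, 2: 3}
    total = sum(worth[p] for p in coalition)
    if {0, 1} <= set(coalition):
        total += 4
    return total


def subsets(players):
    """Yield every subset of `players` as a tuple, including the empty one."""
    for size in range(len(players) + 1):
        yield from combinations(players, size)


def shapley_value(i):
    """Weighted average of the marginal contributions of player i."""
    n = len(PLAYERS)
    others = [p for p in PLAYERS if p != i]
    result = Fraction(0)
    for s in subsets(others):
        weight = Fraction(factorial(len(s)) * factorial(n - len(s) - 1), factorial(n))
        result += weight * (value(s + (i,)) - value(s))
    return result


def pairwise_interaction(i, j):
    """Shapley interaction index of the pair (i, j)."""
    n = len(PLAYERS)
    others = [p for p in PLAYERS if p not in (i, j)]
    result = Fraction(0)
    for s in subsets(others):
        weight = Fraction(factorial(len(s)) * factorial(n - len(s) - 2), factorial(n - 1))
        # joint effect of i and j minus their individual effects on top of s
        delta = value(s + (i, j)) - value(s + (i,)) - value(s + (j,)) + value(s)
        result += weight * delta
    return result


shapley_values = {i: shapley_value(i) for i in PLAYERS}
interactions = {pair: pairwise_interaction(*pair) for pair in combinations(PLAYERS, 2)}
```

In this toy game the Shapley values are 3, 4, and 3, and the only non-zero pairwise interaction is 4 for players 0 and 1. That is exactly the cooperation bonus, which the individual Shapley values split evenly between the two players without revealing where it comes from.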

🛠️ Install

shapiq is intended to work with Python 3.9 and above. Installation can be done via pip:

pip install shapiq

⭐ Quickstart

You can explain your model with shapiq.explainer and visualize Shapley interactions with shapiq.plot. If you are interested in the underlying game theoretic algorithms, then check out the shapiq.approximator and shapiq.games modules.

📈 Compute any-order feature interactions

Explain your models with Shapley interaction values like the k-SII values:

import shapiq
# load data
X, y = shapiq.load_california_housing(to_numpy=True)
# train a model
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor()
model.fit(X, y)
# set up an explainer with k-SII interaction values up to order 4
explainer = shapiq.TabularExplainer(
    model=model,
    data=X,
    index="k-SII",
    max_order=4
)
# explain the model's prediction for the first sample
interaction_values = explainer.explain(X[0], budget=256)
# analyse interaction values
print(interaction_values)

>> InteractionValues(
>>     index=k-SII, max_order=4, min_order=0, estimated=False,
>>     estimation_budget=256, n_players=8, baseline_value=2.07282292,
>>     Top 10 interactions:
>>         (0,): 1.696969079  # attribution of feature 0
>>         (0, 5): 0.4847876
>>         (0, 1): 0.4494288  # interaction between features 0 & 1
>>         (0, 6): 0.4477677
>>         (1, 5): 0.3750034
>>         (4, 5): 0.3468325
>>         (0, 3, 6): -0.320  # interaction between features 0 & 3 & 6
>>         (2, 3, 6): -0.329
>>         (0, 1, 5): -0.363
>>         (6,): -0.56358890
>> )
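
The numbers in the printed output can be sanity-checked with a little arithmetic. The sketch below uses only the standard library and the values shown above (`n_players=8`, `max_order=4`, `min_order=0`, `budget=256`). That a budget covering every coalition is the reason for `estimated=False` is an assumption about how shapiq sets this flag, not something stated here.

```python
from math import comb

n_players = 8   # from the printed output above
max_order = 4   # as passed to the explainer
budget = 256    # as passed to explain()

# Number of distinct coalitions (subsets of players) a game with 8 players has.
n_coalitions = 2 ** n_players

# Number of interactions of order 0 (empty set) up to max_order.
n_interactions = sum(comb(n_players, order) for order in range(max_order + 1))

# True when the budget is large enough to evaluate every coalition once.
budget_covers_all = budget >= n_coalitions
```

With 8 players there are 256 coalitions, so a budget of 256 is enough to evaluate each of them once, and there are 163 interactions of order 0 to 4, of which the output lists only the top 10.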

📊 Visualize feature interactions

A handy way to visualize interaction scores up to order 2 is a network plot. You can see an example of such a plot below. The nodes represent feature attributions and the edges represent the interactions between features. The size of each node and the width of each edge are proportional to the absolute value of the corresponding attribution or interaction.

shapiq.network_plot(
    first_order_values=interaction_values.get_n_order_values(1),
    second_order_values=interaction_values.get_n_order_values(2)
)
# or use
interaction_values.plot_network()

The code above can produce a plot like the following:

(image: network plot example)
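
The proportional sizing described above can be sketched in plain Python. This is not how shapiq draws the plot: `scale_by_magnitude` and the maximum sizes are hypothetical choices for illustration, and the values are copied from the printed top-10 output (only the interactions listed there are included).

```python
# First- and second-order values copied from the printed output above.
first_order = {(0,): 1.696969079, (6,): -0.56358890}
second_order = {
    (0, 5): 0.4847876,
    (0, 1): 0.4494288,
    (0, 6): 0.4477677,
    (1, 5): 0.3750034,
    (4, 5): 0.3468325,
}


def scale_by_magnitude(values, max_size):
    """Map each value to a size proportional to its absolute value."""
    largest = max(abs(v) for v in values.values())
    return {key: max_size * abs(v) / largest for key, v in values.items()}


node_sizes = scale_by_magnitude(first_order, max_size=100.0)  # one entry per feature
edge_widths = scale_by_magnitude(second_order, max_size=5.0)  # one entry per feature pair
```

Because only the absolute value determines the size, the negative attribution of feature 6 still gets a visible node, roughly a third the size of the node for feature 0.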

📖 Documentation with tutorials

The documentation of shapiq can be found at https://shapiq.readthedocs.io

💬 Citation

If you enjoy using the shapiq package, please consider citing our NeurIPS paper:

@inproceedings{muschalik2024shapiq,
  title     = {shapiq: Shapley Interactions for Machine Learning},
  author    = {Maximilian Muschalik and Hubert Baniecki and Fabian Fumagalli and
               Patrick Kolpaczki and Barbara Hammer and Eyke H\"{u}llermeier},
  booktitle = {The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year      = {2024},
  url       = {https://openreview.net/forum?id=knxGmi6SJi}
}

Changelog

v1.1.1 (2024-11-13)

Improvements and Ease of Use

  • adds a class_index parameter to TabularExplainer and Explainer to specify the class index to be explained for classification models #271 (renames class_label parameter in TreeExplainer to class_index)
  • adds support for PyTorch models to Explainer #272
  • adds new tests comparing shapiq outputs for SVs with values computed with shap
  • adds new tests for checking shapiq explainers with different types of models

Bug Fixes

  • fixes a bug that RandomForestClassifier models were not working with the TreeExplainer #273

v1.1.0 (2024-11-07)

New Features and Improvements

  • adds computation of the Egalitarian Core (EC) and Egalitarian Least-Core (ELC) to the ExactComputer #182
  • adds waterfall_plot #34 that visualizes the contributions of features to the model prediction
  • adds BaselineImputer #107 which is now responsible for handling the sample_replacements parameter. Added a DeprecationWarning for the parameter in MarginalImputer, which will be removed in the next release.
  • adds joint_marginal_distribution parameter to MarginalImputer with default value True #261
  • renames explanation graph to si_graph
  • get_n_order now has optional lower/upper limits for the order
  • computing metrics for benchmarking now tries to resolve non-matching interaction indices and issues a warning instead of raising a ValueError #179
  • adds a legend to benchmark plots #170
  • refactors the shapiq.games.benchmark module into a separate shapiq.benchmark module by moving everything except the benchmark games into the new module. This closes #169 and makes benchmarking more flexible and convenient.
  • a shapiq.Game can now be called more intuitively with coalition data types (tuples of int or str) and also allows adding player_names to the game at initialization #183
  • improves tests across the package

Documentation

  • adds a notebook showing how to use custom tree models with the TreeExplainer #66
  • adds a notebook showing how to use the shapiq.Game API to create custom games #184
  • adds a notebook showing how to visualize interactions #252
  • adds a notebook showing how to compute Shapley values with shapiq #193
  • adds a notebook for conducting data valuation #190
  • adds a notebook introducing the Core and showing how to compute it with shapiq #191

Bug Fixes

  • fixes a bug with SIs not adding up to the model prediction because of wrong values in the empty set #264
  • fixes a bug that TreeExplainer did not have the correct baseline_value when using XGBoost models #250
  • fixes the force plot not being displayed and corrects its baseline value

v1.0.1 (2024-06-05)

  • add max_order=1 to TabularExplainer and TreeExplainer
  • fix TreeExplainer.explain_X(..., n_jobs=2, random_state=0)

v1.0.0 (2024-06-04)

Major release of the shapiq Python package including (among others):

  • approximator module implements over 10 approximators of Shapley values and interaction indices.
  • exact module implements a computer for over 10 game theoretic concepts like interaction indices or generalized values.
  • games module implements over 10 application benchmarks for the approximators.
  • explainer module includes a TabularExplainer and TreeExplainer for any-order feature interactions of machine learning model predictions.
  • interaction_values module implements a data class to store and analyze interaction values.
  • plot module allows visualizing interaction values.
  • datasets module loads datasets for testing and examples.

Documentation of shapiq with tutorials and API reference is available at https://shapiq.readthedocs.io
