Wrapper for Shapley explanations

Shapsplain

A wrapper library for SHAP-value-based explanations, suitable for several kinds of BigML models. For background on Shapley-value-based model explanation, there are academic papers on both the tree-based and the gradient-based algorithms.

SHAP-value explanations are additive: you start with some "expected value" for a prediction (usually based on the prior distribution of the training data), and each feature then contributes some amount that "pushes" the prediction in one direction or the other. As such, SHAP importance values can be positive or negative.
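
The additive property can be written compactly. The notation here is an assumption chosen for illustration and does not come from the library: $\phi_0$ stands for the expected value, $\phi_i$ for the contribution of feature $i$, $M$ for the number of features, and $f(x)$ for the prediction on input $x$.

```latex
f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i
```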

Tree-based Model Explanations

To construct a Shapley-value predictor for a BigML model, one can use the ShapForest class:

from shapsplain.forest import ShapForest

forest = ShapForest(model)

where model is a dictionary containing the JSON model downloaded from BigML. To make a prediction with the model, one can use the predict method:
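
A minimal sketch of reading such a dictionary from disk, assuming the model JSON has already been downloaded from BigML and saved locally. The helper name `load_model_json` and the file name in the comments are hypothetical; only `ShapForest(model)` comes from the text above.

```python
import json


def load_model_json(path):
    """Read a saved BigML model JSON file into a plain dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Hypothetical usage; "anomaly_detector.json" is a placeholder file name.
# model = load_model_json("anomaly_detector.json")
# forest = ShapForest(model)
```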

forest.predict({'petal length': 4.2, 'sepal length': 0.2})

If the model is an anomaly detector, for example, the output is a list with a single value:

[0.8785773]

To get an explanation with the prediction, pass the optional argument explanation=True.

forest.predict({'petal length': 4.2, 'sepal length': 0.2}, explanation=True)

The value returned from this call is a list with one element per class for classification models, or a single element for regression models and anomaly detectors. Each element is itself a list: the first value is the prediction, and the subsequent values are the importance factors for each feature in the model, ordered by absolute importance value.

[
    [
        0.8785772973680668,
        ["000003", 0.10580323276121423],
        ["000004", 0.12349988309753568],
        ["000001", 0.1452021495824386],
        ["000000", 0.034717729468588754],
        ["000002", 0.029713614812585054]
    ]
]
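
One way to work with this structure is to split each inner list into the prediction and its (field ID, importance) pairs. The sketch below uses a hypothetical helper, `split_explanation`, that is not part of shapsplain. It sorts the pairs explicitly by absolute importance rather than relying on the order in which they arrive.

```python
def split_explanation(class_explanation):
    """Return (prediction, pairs), with pairs sorted by absolute importance."""
    prediction, *pairs = class_explanation
    ranked = sorted(
        ((field_id, value) for field_id, value in pairs),
        key=lambda pair: abs(pair[1]),
        reverse=True,
    )
    return prediction, ranked


# Output copied from the anomaly detector example above.
result = [
    [
        0.8785772973680668,
        ["000003", 0.10580323276121423],
        ["000004", 0.12349988309753568],
        ["000001", 0.1452021495824386],
        ["000000", 0.034717729468588754],
        ["000002", 0.029713614812585054],
    ]
]

prediction, ranked = split_explanation(result[0])
for field_id, value in ranked:
    print(f"{field_id}: {value:+.4f}")
```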

Subtracting all of these importances from the prediction gives a "baseline" score for the model on its training data (about 0.44 here). You can see from the importances above that the two values provided to the model are far less important than the missing values for "petal width", "sepal width", and "species", because the training data contained no missing values.
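
The subtraction can be sketched as follows, using the numbers from the anomaly detector output above. The function name `baseline_score` is hypothetical and is not part of shapsplain.

```python
import math


def baseline_score(class_explanation):
    """Subtract every feature importance from the prediction."""
    prediction, *pairs = class_explanation
    return prediction - math.fsum(value for _, value in pairs)


# Inner list copied from the anomaly detector example above.
explained = [
    0.8785772973680668,
    ["000003", 0.10580323276121423],
    ["000004", 0.12349988309753568],
    ["000001", 0.1452021495824386],
    ["000000", 0.034717729468588754],
    ["000002", 0.029713614812585054],
]

baseline = baseline_score(explained)
print(baseline)
```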

In general, these importance values may be positive or negative, depending on their overall impact on the prediction. For example, let's generate an explanation for an "Iris" model prediction for one of the "Iris-setosa" instances.

forest = ShapForest(iris_classifier)
pt = {"sepal length": 5.1, "sepal width": 3.5, "petal length": 1.4, "petal width": 0.2}
forest.predict(pt, explanation=True)

Here the outermost list has three elements, one for each class. In this case all features happen to contribute positively to the predicted class, but this is not true in general.

[
    [
        1.0,
        ['000002', 0.31709336248906567],
        ['000003', 0.29068868841929324],
        ['000000', 0.06062257935410617],
        ['000001', 0.0025287030708683912]
    ],
    [
        0.0,
        ['000002', -0.16150644430157587],
        ['000003', -0.15387190455071248],
        ['000000', -0.02328311608231396],
        ['000001', 0.004928131601268985]
    ],
    [
        0.0,
        ['000002', -0.15558691818748985],
        ['000003', -0.13681678386858068],
        ['000000', -0.03733946327179222],
        ['000001', -0.0074568346721373725]
    ]
]

If one wishes to have a direction-agnostic notion of importance for the predicted class, one can take the absolute values of the returned importances and normalize them; the result still has useful semantics.
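
A minimal sketch of that normalization, using the Iris output above. Two assumptions are made here that the text does not state: the predicted class is taken to be the one with the highest first value, and the normalization divides each absolute importance by their sum, so the shares add up to 1. The function name `normalized_importances` is hypothetical.

```python
def normalized_importances(explanation):
    """Normalize absolute importances for the highest-scoring class."""
    # Assumption: the predicted class is the inner list with the largest prediction.
    predicted = max(explanation, key=lambda cls: cls[0])
    _, *pairs = predicted
    total = sum(abs(value) for _, value in pairs)
    if total == 0:
        return {field_id: 0.0 for field_id, _ in pairs}
    return {field_id: abs(value) / total for field_id, value in pairs}


# Output copied from the Iris classifier example above.
iris_explanation = [
    [
        1.0,
        ["000002", 0.31709336248906567],
        ["000003", 0.29068868841929324],
        ["000000", 0.06062257935410617],
        ["000001", 0.0025287030708683912],
    ],
    [
        0.0,
        ["000002", -0.16150644430157587],
        ["000003", -0.15387190455071248],
        ["000000", -0.02328311608231396],
        ["000001", 0.004928131601268985],
    ],
    [
        0.0,
        ["000002", -0.15558691818748985],
        ["000003", -0.13681678386858068],
        ["000000", -0.03733946327179222],
        ["000001", -0.0074568346721373725],
    ],
]

shares = normalized_importances(iris_explanation)
for field_id, share in shares.items():
    print(f"{field_id}: {share:.1%}")
```

Because the signs are discarded, these shares say how much each feature mattered for the predicted class, not in which direction it pushed.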

