Wrapper for Shapley explanations
Shapsplain
A wrapper library for SHAP-value based explanations, suitable for several kinds of BigML models. For background on Shapley-value based model explanation, there are academic papers on both tree-based and gradient-based algorithms.
SHAP-value explanations are additive: you start out with some "expected value" for a prediction (usually based on the prior distribution of the training data), and then each feature contributes some amount that "pushes" the prediction in one direction or the other. As such, SHAP importance values can be positive or negative.
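The additive property can be written compactly. The notation here is an assumption introduced for illustration and does not come from the library: f(x) is the prediction for input x, \phi_0 is the expected value, and \phi_i(x) is the contribution of feature i out of M features.

```latex
f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i(x)
```

Each \phi_i(x) may be positive or negative, so the prediction can land above or below \phi_0.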
Tree-based Model Explanations
To construct a Shapley-value predictor for a BigML model, one can use the ShapForest class:
from shapsplain.forest import ShapForest
forest = ShapForest(model)
where model is a dictionary containing the JSON model downloaded from BigML. To make a prediction with the model, one can use the predict method:
forest.predict({'petal length': 4.2, 'sepal length': 0.2})
If this is an anomaly detector, for example, this will output a list with a single value:
[0.8785773]
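The model dictionary passed to ShapForest can come from a JSON file saved to disk. The following is a minimal sketch of that step using only the standard library. The helper names parse_model and load_model, and the filename in the usage comment, are hypothetical and not part of shapsplain.

```python
import json
from pathlib import Path


def parse_model(text):
    """Decode JSON text and check that it holds a dictionary."""
    model = json.loads(text)
    if not isinstance(model, dict):
        raise ValueError("expected a JSON object at the top level")
    return model


def load_model(path):
    """Read a JSON model file from disk and return it as a dictionary."""
    return parse_model(Path(path).read_text(encoding="utf-8"))


# Usage, assuming "model.json" (a hypothetical filename) holds the
# JSON model downloaded from BigML:
#
#   from shapsplain.forest import ShapForest
#   forest = ShapForest(load_model("model.json"))
```

The type check is only a guard against passing a JSON list or scalar by mistake. It does not validate that the dictionary is a well-formed BigML model.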
To get an explanation with the prediction, pass the optional argument explanation=True:
forest.predict({'petal length': 4.2, 'sepal length': 0.2}, explanation=True)
The value returned from this call is a list, with one element per class for classification models, or a single element for regression models and anomaly detectors. In each inner list, the first value is the prediction. Subsequent values are the importance factors for each feature in the model, given as pairs of field ID and importance value and ordered by absolute importance value.
[
[
0.8785772973680668,
["000003", 0.10580323276121423],
["000004", 0.12349988309753568],
["000001", 0.1452021495824386],
["000000", 0.034717729468588754],
["000002", 0.029713614812585054]
]
]
Subtracting all of these importances from the prediction gives a "baseline" score for the model on its training data. You can see from the importances above that the two values provided to the model are far less important than the missing values for "petal width", "sepal width", and "species", which account for the three largest importances, as there were no missing values in training.
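The subtraction described above can be sketched in a few lines of plain Python, using the anomaly-detector output shown earlier. The helper names split_explanation and baseline are hypothetical and not part of shapsplain.

```python
# Explanation output copied from the anomaly-detector example.
explanation = [
    [
        0.8785772973680668,
        ["000003", 0.10580323276121423],
        ["000004", 0.12349988309753568],
        ["000001", 0.1452021495824386],
        ["000000", 0.034717729468588754],
        ["000002", 0.029713614812585054],
    ]
]


def split_explanation(row):
    """Split one inner list into (prediction, {field_id: importance})."""
    prediction, *pairs = row
    return prediction, {field_id: value for field_id, value in pairs}


def baseline(row):
    """Prediction minus the sum of all feature importances."""
    prediction, importances = split_explanation(row)
    return prediction - sum(importances.values())


score_baseline = baseline(explanation[0])
print(round(score_baseline, 4))  # about 0.4396 for this output
```

For a classification model, the same helper can be applied to each inner list to get one baseline per class.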
In general, these importance values may be positive or negative, depending on their overall impact on the prediction. For example, let's generate an explanation for an "Iris" model prediction for one of the "Iris-setosa" instances.
forest = ShapForest(iris_classifier)
pt = {"sepal length": 5.1, "sepal width": 3.5, "petal length": 1.4, "petal width": 0.2}
forest.predict(pt, explanation=True)
Here, of course, the outermost list has three elements, one for each class. In this case all features happen to contribute positively to the predicted class, but this is not true in general.
[
[
1.0,
['000002', 0.31709336248906567],
['000003', 0.29068868841929324],
['000000', 0.06062257935410617],
['000001', 0.0025287030708683912]
],
[
0.0,
['000002', -0.16150644430157587],
['000003', -0.15387190455071248],
['000000', -0.02328311608231396],
['000001', 0.004928131601268985]
],
[
0.0,
['000002', -0.15558691818748985],
['000003', -0.13681678386858068],
['000000', -0.03733946327179222],
['000001', -0.0074568346721373725]
]
]
If one wishes to have a direction-agnostic notion of importance for the predicted class, the absolute values of the returned importances may be normalized (for example, so that they sum to one) and still retain useful semantics.
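One way to do this normalization is sketched below, using the Iris output shown earlier. Two assumptions are made here that the library does not state: the predicted class is taken to be the inner list with the highest prediction, and "normalized" is taken to mean dividing each absolute importance by their sum. The function name normalized_importances is hypothetical.

```python
# Explanation output copied from the Iris classification example.
iris_explanation = [
    [1.0,
     ["000002", 0.31709336248906567],
     ["000003", 0.29068868841929324],
     ["000000", 0.06062257935410617],
     ["000001", 0.0025287030708683912]],
    [0.0,
     ["000002", -0.16150644430157587],
     ["000003", -0.15387190455071248],
     ["000000", -0.02328311608231396],
     ["000001", 0.004928131601268985]],
    [0.0,
     ["000002", -0.15558691818748985],
     ["000003", -0.13681678386858068],
     ["000000", -0.03733946327179222],
     ["000001", -0.0074568346721373725]],
]


def normalized_importances(explanation):
    """Return {field_id: share} for the class with the highest prediction.

    Shares are absolute importances divided by their total, so they are
    non-negative and sum to one (or are all zero if every importance is zero).
    """
    predicted = max(explanation, key=lambda row: row[0])
    pairs = predicted[1:]
    total = sum(abs(value) for _, value in pairs)
    if total == 0:
        return {field_id: 0.0 for field_id, _ in pairs}
    return {field_id: abs(value) / total for field_id, value in pairs}


shares = normalized_importances(iris_explanation)
for field_id, share in shares.items():
    print(field_id, round(share, 3))
```

With this output, fields 000002 and 000003 together account for roughly 90% of the total absolute importance for the predicted class.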