A technique in explainable AI for answering broader questions in machine learning.

Project description

Generalized Shapley Additive Explanations (G-SHAP) is a technique in explainable AI for answering broad questions in machine learning.

Applications

General classification and regression

Suppose we have a black-box model which diagnoses patients with COVID-19, the flu, or a common cold based on their symptoms. Existing explanatory methods can tell us why our model diagnosed a patient with COVID-19. G-SHAP can answer broader questions, such as: how do the symptoms that distinguish COVID-19 from the flu differ from those that distinguish COVID-19 from the common cold?

Full analysis here.

Intergroup differences

Suppose we have a black-box model which predicts a criminal’s risk of recidivism to determine whether they are eligible for parole. Existing explanatory methods can tell us why our model predicted that a criminal has a high recidivism risk. G-SHAP can answer broader questions, such as: why does our model predict that Black criminals have higher recidivism rates than White criminals?

Full analysis here.
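
An intergroup question like the one above can be framed as a single number computed from the model's predictions: how far apart are the average predictions for two groups? The sketch below is illustrative only. The helper name `relative_mean_gap` and the toy data are assumptions made up for this example, not part of the gshap API and not the library's implementation.

```python
from statistics import mean


def relative_mean_gap(predictions, in_group):
    """Hypothetical helper: relative gap between the mean prediction
    inside a group and the mean prediction outside it."""
    inside = [p for p, flag in zip(predictions, in_group) if flag]
    outside = [p for p, flag in zip(predictions, in_group) if not flag]
    return (mean(inside) - mean(outside)) / mean(outside)


# Toy data invented for illustration: 1 = predicted to recidivate
predictions = [1, 1, 0, 1, 0, 0, 1, 0]
# 1 = member of the group of interest
in_group = [1, 1, 1, 1, 0, 0, 0, 0]

gap = relative_mean_gap(predictions, in_group)
print(gap)
```

A positive value means the model predicts a higher rate inside the group than outside it. An explanation method can then attribute that single number to individual features.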

Model performance and failure

Suppose we have a black-box model which forecasts GDP growth based on macroeconomic variables. Existing explanatory methods can tell us why our model forecast 3% GDP growth in a given year. G-SHAP can answer broader questions, such as: why did our model fail to forecast the 2008-2009 financial crisis?

Full analysis here.

Installation

$ pip install gshap

Quickstart

Here we train a support vector classifier to predict whether a criminal will recidivate within two years of release from prison. We use G-SHAP to ask why our model predicts that Black criminals are more likely to recidivate than non-Black criminals.

import gshap
from gshap.datasets import load_recidivism
from gshap.intergroup import IntergroupDifference

from sklearn.svm import SVC

recidivism = load_recidivism()
X, y = recidivism.data, recidivism.target
clf = SVC().fit(X, y)

g = IntergroupDifference(group=X['black'], distance='relative_mean_distance')
explainer = gshap.KernelExplainer(clf.predict, X, g)
explainer.gshap_values(X, nsamples=10)

Out:

array([ 0.01335252,  0.24884556,  0.00132373, -0.0025238 , -0.00151837,
    0.40453822,  0.01636782,  0.07666043, -0.00056414,  0.00966583])

The sum of the G-SHAP values is the relative difference in predicted recidivism rates between the two groups. Here the values sum to about 0.77, so the model predicts that Black criminals are roughly 77% more likely to recidivate than non-Black criminals.

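The variables most responsible for this difference are the number of prior convictions (index 5; about 40 percentage points), age (index 1; about 25 percentage points), and race (index 7; about 8 percentage points). Indices are zero-based positions in the output array.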
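
The printed array can be checked with a few lines of standard-library Python. This is a minimal sketch that copies the values from the output above. It does not call gshap, and the variable names are chosen only for this example.

```python
import math

# G-SHAP values copied from the quickstart output, one per feature column
gshap_values = [
    0.01335252, 0.24884556, 0.00132373, -0.0025238, -0.00151837,
    0.40453822, 0.01636782, 0.07666043, -0.00056414, 0.00966583,
]

# The sum is the quantity being explained: the relative difference
# in predicted recidivism rates between the two groups
total = math.fsum(gshap_values)

# Rank feature indices by absolute contribution, largest first
ranked = sorted(
    range(len(gshap_values)),
    key=lambda i: abs(gshap_values[i]),
    reverse=True,
)
top_three = ranked[:3]

print(f"total: {total:.4f}")
for i in top_three:
    print(f"index {i}: {gshap_values[i]:.4f} ({gshap_values[i] / total:.0%} of total)")
```

Because the quickstart uses `nsamples=10`, the values are rough estimates and will likely vary from run to run.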

Citation

@software{bowen2020gshap,
  author = {Dillon Bowen},
  title = {Generalized Shapley Additive Explanations},
  url = {https://dsbowen.github.io/gshap/},
  date = {2020-05-19},
}

License

Users must cite G-SHAP in any publications which use this software.

G-SHAP is licensed with the MIT License.

Download files

Source Distribution

gshap-0.0.3.tar.gz (34.6 kB)

Uploaded Source

Built Distribution

gshap-0.0.3-py3-none-any.whl (37.1 kB)

Uploaded Python 3

File details

Details for the file gshap-0.0.3.tar.gz.

File metadata

  • Download URL: gshap-0.0.3.tar.gz
  • Upload date:
  • Size: 34.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.3.1 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.6.8

File hashes

Hashes for gshap-0.0.3.tar.gz

Algorithm    Hash digest
SHA256       34ed62903521a9c97a5f13c52030306c11e675e63ce2dfb2192c921714e868e0
MD5          7c8b6c516c19010ae64b14a6581dba48
BLAKE2b-256  ef68e91b9ed002ebdc870e8e960de96534f867434c715d2ca25e162b53378c79

File details

Details for the file gshap-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: gshap-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 37.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.3.1 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.6.8

File hashes

Hashes for gshap-0.0.3-py3-none-any.whl

Algorithm    Hash digest
SHA256       682e9f314fd384c57d5ee1926600d9dafc0ac298e46dc4ba4bf151075daaa657
MD5          fafa13ed0889b4991daf9a49182646ee
BLAKE2b-256  aef36b3b476bd5e00539e53aec2406cadd9d5d3c080d3f78377417edf4056a6a
