ACV is a library that provides robust and accurate explanations for machine learning models or data

Active Coalition of Variables (ACV):

ACV is a Python library that aims to explain any machine learning model or dataset.

  • It gives local rule-based explanations for any model or data.
  • It provides a better estimation of Shapley Values for tree-based models (more accurate than path-dependent TreeSHAP). It also proposes new Shapley Values that have better local fidelity.
    • It also gives the correct way of computing Shapley Values of categorical variables after encoding (e.g., One Hot, Dummy, etc.)

The explanations fall into two groups: Agnostic Explanations and Tree-based Explanations.

See the papers here.

Installation

Requirements

Python 3.7-3.9

OSX: ACV uses Cython extensions that need to be compiled with multi-threading support enabled. The default Apple Clang compiler does not support OpenMP. To solve this issue, obtain the latest gcc version with Homebrew, which has multi-threading enabled: see for example the pysteps installation for OSX.

Windows: Install MinGW (a Windows distribution of gcc) or Microsoft’s Visual C

Install the acv package:

$ pip install acv-exp

A. Agnostic explanations

The Agnostic approaches explain any data (X, Y) or model (X, f(X)) using the following explanation methods:

  • Same Decision Probability (SDP) and Sufficient Explanations
  • Sufficient Rules

See the paper Consistent Sufficient Explanations and Minimal Local Rules for explaining regression and classification models for more details.

I. First, we need to fit our explainer (ACXplainer) to the input-output pairs of the data (X, Y) or of the model (X, f(X)), depending on whether we want to explain the data or the model.

from sklearn.metrics import roc_auc_score
from acv_explainers import ACXplainer

# ACXplainer has the same parameters as a Random Forest and should be tuned to maximize performance.
acv_xplainer = ACXplainer(classifier=True, n_estimators=50, max_depth=5)
acv_xplainer.fit(X_train, y_train)

# roc_auc_score expects the true labels first, then the predictions.
roc = roc_auc_score(y_test, acv_xplainer.predict(X_test))

II. Then, we can load all the explanations in a webApp as follows:

import acv_app
import os

# compile the ACXplainer
acv_app.compile_ACXplainers(acv_xplainer, X_train, y_train, X_test, y_test, path=os.getcwd())

# Launch the webApp
acv_app.run_webapp(pickle_path=os.getcwd())

[Screenshot of the ACV web app, captured 2021-11-03 19:50:12]

III. Or we can compute each explanation separately as follows:

Same Decision Probability (SDP)

The main tool of our explanations is the Same Decision Probability (SDP). Given $x = (x_S, x_{\bar{S}})$, the Same Decision Probability $SDP_S(x, f)$ of the variables $x_S$ is the probability that the prediction remains the same when we fix the variables $X_S = x_S$, or equivalently when the variables $X_{\bar{S}}$ are missing.

  • How to compute $SDP_S(x, f)$?
# data_bground is the background dataset used for the estimation. It should be the training samples.
sdp = acv_xplainer.compute_sdp_rf(X, S, data_bground)
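
To make the definition concrete, the sketch below estimates the SDP empirically on a toy problem, without using ACV. It is only an illustration: `empirical_sdp`, the toy classifier `f` and the small binary background set are assumptions made for this example, and `compute_sdp_rf` relies on its own Random Forest based estimator rather than on this exact-match counting.

```python
from itertools import product


def empirical_sdp(f, x, S, background):
    """Fraction of background rows agreeing with x on S that keep the prediction f(x)."""
    matching = [z for z in background if all(z[i] == x[i] for i in S)]
    if not matching:
        raise ValueError("no background row matches x on S")
    target = f(x)
    return sum(f(z) == target for z in matching) / len(matching)


# Hypothetical toy classifier: predicts 1 only when the first two features are both 1.
def f(z):
    return int(z[0] == 1 and z[1] == 1)


# Toy background set: every combination of three binary features.
background = list(product([0, 1], repeat=3))
x = (1, 1, 0)

sdp_first = empirical_sdp(f, x, [0], background)    # fix only feature 0
sdp_both = empirical_sdp(f, x, [0, 1], background)  # fix features 0 and 1
print(sdp_first, sdp_both)
```

Fixing only feature 0 leaves the prediction uncertain, while fixing features 0 and 1 guarantees it, which is the behaviour the SDP is meant to capture.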

Minimal Sufficient Explanations

A Sufficient Explanation is a minimal subset S such that fixing the values $x_S$ maintains the prediction with high probability $\pi$. See the paper here for more details.

  • How to compute the Minimal Sufficient Explanation?

    The following code returns the Sufficient Explanation with minimal cardinality.

sdp_importance, min_sufficient_expl, size, sdp = acv_xplainer.importance_sdp_rf(X, y, X_train, y_train, pi_level=0.9)

  • How to compute all the Sufficient Explanations?

    Since the Minimal Sufficient Explanation may not be unique for a given instance, we can compute all of them.

sufficient_expl, sdp_expl, sdp_global = acv_xplainer.sufficient_expl_rf(X, y, X_train, y_train, pi_level=0.9)
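
The search for Sufficient Explanations can be pictured as a brute-force scan over subsets of increasing size. The sketch below does this on a toy problem where two different explanations of the same size exist. The helpers `empirical_sdp` and `minimal_sufficient_explanations`, the classifier `f` and the data are hypothetical; ACV's own search is not reproduced here.

```python
from itertools import combinations, product


def empirical_sdp(f, x, S, background):
    """Fraction of background rows agreeing with x on S that keep the prediction f(x)."""
    matching = [z for z in background if all(z[i] == x[i] for i in S)]
    if not matching:
        return 0.0
    target = f(x)
    return sum(f(z) == target for z in matching) / len(matching)


def minimal_sufficient_explanations(f, x, background, pi_level=0.9):
    """Return every subset S of smallest size whose SDP reaches pi_level."""
    d = len(x)
    for size in range(d + 1):
        found = [S for S in combinations(range(d), size)
                 if empirical_sdp(f, x, S, background) >= pi_level]
        if found:
            return found
    return []


# Hypothetical toy classifier: predicts 1 when either of the first two features is 1.
def f(z):
    return int(z[0] == 1 or z[1] == 1)


background = list(product([0, 1], repeat=3))
x = (1, 1, 0)

explanations = minimal_sufficient_explanations(f, x, background, pi_level=0.9)
print(explanations)
```

Here fixing feature 0 alone or feature 1 alone is enough to keep the prediction, so the minimal explanation is not unique. The scan is exponential in the number of features and is meant for illustration only.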

Local Explanatory Importance

For a given instance, the local explanatory importance of each variable is the frequency with which that variable appears in the Sufficient Explanations. See the paper here for more details.

  • How to compute the Local Explanatory Importance?
# The first argument is the number of features d.
lximp = acv_xplainer.compute_local_sdp(X_train.shape[1], sufficient_expl)
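
As a rough illustration of this frequency, the sketch below counts how often each variable appears across a list of Sufficient Explanations for one instance. The function `local_explanatory_importance` and its input format (a list of index tuples) are assumptions for this example; the normalization used by `compute_local_sdp` may differ.

```python
def local_explanatory_importance(d, sufficient_expl):
    """Share of the Sufficient Explanations in which each of the d variables appears."""
    if not sufficient_expl:
        return [0.0] * d
    counts = [0] * d
    for S in sufficient_expl:
        for i in set(S):  # count a variable at most once per explanation
            counts[i] += 1
    return [c / len(sufficient_expl) for c in counts]


# Two explanations of size 1: each of the first two variables appears in half of them.
lximp_a = local_explanatory_importance(3, [(0,), (1,)])
# Variable 0 appears in every explanation, variables 1 and 2 in half of them.
lximp_b = local_explanatory_importance(3, [(0, 1), (0, 2)])
print(lximp_a, lximp_b)
```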

Local rule-based explanations

For a given instance (x, y) and its Sufficient Explanation S such that $SDP_S(x, f) \geq \pi$, we compute a minimal local rule that contains x such that every observation z satisfying this rule has $SDP_S(z, f) \geq \pi$. See the paper here for more details.

  • How to compute the local rule explanations?
# data_bground is the background dataset used for the estimation. It should be the training samples.
sdp, rules, _, _, _ = acv_xplainer.compute_sdp_maxrules(X, y, data_bground, y_bground, S)
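
One way to read such a rule is as a set of intervals on the variables in S. The sketch below checks which observations satisfy a rule and how often they share the explained label. The dictionary format of `rule`, the helper names and the data are hypothetical and chosen for illustration; the exact layout of the `rules` array returned by `compute_sdp_maxrules` is not shown here.

```python
import math


def satisfies(rule, z):
    """True when z lies inside every interval of the rule."""
    return all(low <= z[i] <= high for i, (low, high) in rule.items())


def rule_precision(rule, data, labels, y):
    """Share of the observations covered by the rule that carry label y."""
    covered = [lab for z, lab in zip(data, labels) if satisfies(rule, z)]
    if not covered:
        return float("nan")
    return sum(lab == y for lab in covered) / len(covered)


# Hypothetical rule on S = {0, 1}: feature 0 >= 0.5 and feature 1 <= 2.0.
rule = {0: (0.5, math.inf), 1: (-math.inf, 2.0)}

data = [(1.0, 1.0, 9.0), (0.7, 1.5, -3.0), (0.2, 1.0, 0.0), (0.9, 3.0, 0.0)]
labels = [1, 1, 0, 0]

coverage = sum(satisfies(rule, z) for z in data) / len(data)
precision = rule_precision(rule, data, labels, y=1)
print(coverage, precision)
```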

B. Tree-based explanations

ACV gives Shapley Values explanations for XGBoost, LightGBM, CatBoostClassifier, scikit-learn and pyspark tree models. It provides the following Shapley Values:

  • Classic local Shapley Values (the value function is the conditional expectation $E[f(X) | X_S = x_S]$)
  • Active Shapley values (Local fidelity and Sparse by design)
  • Swing Shapley Values (The Shapley values are interpretable by design) (Coming soon)

In addition, we use the coalitional version of SV to properly handle categorical variables in the computation of SV.

See the papers here

To explain the tree-based models above, we need to transform our model into an ACVTree.

from xgboost import XGBClassifier
from acv_explainers import ACVTree

forest = XGBClassifier()  # or any tree-based model
# ... train the model

# data_bground is the background dataset used for the estimation. It should be the training samples.
acvtree = ACVTree(forest, data_bground)

Accurate Shapley Values

sv = acvtree.shap_values(X)

Note that this gives a better estimate than the tree-path-dependent algorithm of TreeSHAP when the variables are dependent.
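
For intuition about what is being estimated, the sketch below computes exact Shapley Values by brute force on a toy model, using an empirical conditional expectation as the value function. It does not reproduce ACV's tree-based algorithm: `conditional_expectation`, `shapley_values`, the model `f` and the background set are assumptions for this example, and the enumeration is exponential in the number of features.

```python
from itertools import combinations, product
from math import factorial


def conditional_expectation(f, x, S, background):
    """Empirical E[f(X) | X_S = x_S] over the background rows that agree with x on S."""
    matching = [z for z in background if all(z[i] == x[i] for i in S)]
    return sum(f(z) for z in matching) / len(matching)


def shapley_values(f, x, background):
    """Exact Shapley Values by enumerating every subset of the other features."""
    d = len(x)
    values = []
    for i in range(d):
        others = [j for j in range(d) if j != i]
        phi = 0.0
        for size in range(d):
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for S in combinations(others, size):
                gain = (conditional_expectation(f, x, S + (i,), background)
                        - conditional_expectation(f, x, S, background))
                phi += weight * gain
        values.append(phi)
    return values


# Hypothetical toy model: outputs 1 only when the first two features are both 1.
def f(z):
    return int(z[0] == 1 and z[1] == 1)


background = list(product([0, 1], repeat=3))
x = (1, 1, 0)

sv = shapley_values(f, x, background)
mean_prediction = sum(f(z) for z in background) / len(background)
print(sv, mean_prediction)
```

The values sum to the difference between the prediction at x and the mean prediction, and the third feature, which the toy model ignores and which is independent of the others here, receives zero.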

Accurate Shapley Values with encoded categorical variables

Let us assume we have a categorical variable Y with k modalities that we encoded by introducing the dummy variables $Y_1, \dots, Y_{k-1}$. As shown in the paper, we must take the coalition of the dummy variables to correctly compute the Shapley values.

# cat_index := list[list[int]] that contains the column indices of the dummies or one-hot variables grouped 
# together for each variable. For example, if we have only 2 categorical variables Y, Z 
# transformed into [Y_0, Y_1, Y_2] and [Z_0, Z_1, Z_2]

cat_index = [[0, 1, 2], [3, 4, 5]]
forest_sv = acvtree.shap_values(X, C=cat_index)
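
When the encoded columns follow a "prefix_value" naming pattern, the grouped indices can be derived from the column names instead of being typed by hand. The helper `build_cat_index` below is a hypothetical convenience function, not part of ACV, and it assumes that no other column name starts with the same prefix followed by the separator.

```python
def build_cat_index(columns, prefixes, sep="_"):
    """Group the column indices of the encoded variables by their original variable name."""
    cat_index = []
    for prefix in prefixes:
        group = [i for i, name in enumerate(columns) if name.startswith(prefix + sep)]
        if group:
            cat_index.append(group)
    return cat_index


# Two categorical variables Y and Z, each encoded in three columns, plus one numeric column.
columns = ["Y_0", "Y_1", "Y_2", "Z_0", "Z_1", "Z_2", "age"]
cat_index = build_cat_index(columns, ["Y", "Z"])
print(cat_index)
```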

In addition, we can compute the SV given any coalitions. For example, let us assume we have 10 variables and we want the following coalition:

coalition = [[0, 1, 2], [3, 4], [5, 6]]
forest_sv = acvtree.shap_values(X, C=coalition)
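
The idea behind the coalitional version can be sketched by treating each group of columns as a single player in the Shapley game. The brute-force code below is an illustration under that assumption and is not ACV's implementation: `coalitional_shapley_values`, the toy model `f` and the background set are hypothetical, and variables not listed in `C` are treated as individual players.

```python
from itertools import combinations, product
from math import factorial


def conditional_expectation(f, x, S, background):
    """Empirical E[f(X) | X_S = x_S] over the background rows that agree with x on S."""
    matching = [z for z in background if all(z[i] == x[i] for i in S)]
    return sum(f(z) for z in matching) / len(matching)


def coalitional_shapley_values(f, x, background, C):
    """Shapley Values where each group in C enters or leaves a subset as one player."""
    d = len(x)
    grouped = {i for group in C for i in group}
    players = [tuple(g) for g in C] + [(i,) for i in range(d) if i not in grouped]
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                S = tuple(i for q in subset for i in q)
                gain = (conditional_expectation(f, x, S + p, background)
                        - conditional_expectation(f, x, S, background))
                phi += weight * gain
        values[p] = phi
    return values


# Hypothetical toy model: outputs 1 only when the first two features are both 1.
def f(z):
    return int(z[0] == 1 and z[1] == 1)


background = list(product([0, 1], repeat=3))
x = (1, 1, 0)

values = coalitional_shapley_values(f, x, background, C=[[0, 1]])
print(values)
```

With features 0 and 1 grouped, the whole contribution goes to the coalition as a unit instead of being split between its members.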

How to compute the SDP for a tree-based classifier?

Recall that the SDP is the probability that the prediction remains the same when we fix the variables in the subset S.

# data_bground is the background dataset used for the estimation. It should be the training samples.
sdp = acvtree.compute_sdp_clf(X, S, data_bground)

How to compute the Sufficient Coalition and the Global SDP importance for a tree-based classifier?

Recall that the Minimal Sufficient Explanation is the minimal subset S such that fixing the values $x_S$ maintains the prediction with high probability $\pi$.

# data_bground is the background dataset used for the estimation. It should be the training samples.
sdp_importance, sdp_index, size, sdp = acvtree.importance_sdp_clf(X, data_bground)

Active Shapley values

The Active Shapley values are SV based on a new game defined in the paper (Accurate and robust Shapley Values for explaining predictions and focusing on local important variables) such that null (non-important) variables have zero SV and the "payout" is fairly distributed among active variables.

  • How to compute Active Shapley values?
import acv_explainers

# First, we need to compute the Active and Null coalitions.
sdp_importance, sdp_index, size, sdp = acvtree.importance_sdp_clf(X, data_bground)
S_star, N_star = acv_explainers.utils.get_active_null_coalition_list(sdp_index, size)

# Then, we use the active coalition found to compute the Active Shapley values.
# C is the list of coalitions of variables, as passed to shap_values above (e.g. cat_index).
forest_asv_adap = acvtree.shap_values_acv_adap(X, C, S_star, N_star, size)

Remarks for tree-based explanations:

If you don't want to use multi-threading (due to scaling or memory problems), add "_nopa" to each function name (e.g. compute_sdp_clf ==> compute_sdp_clf_nopa). You can also cache the different values needed by setting cache=True when initializing ACVTree, e.g. ACVTree(model, data_bground, cache=True).

Examples and tutorials (a lot more to come...)

We can find a tutorial on the usage of ACV in demo_acv, and the notebooks below demonstrate different use cases for ACV. Look inside the notebook directory of the repository if you want to try the original notebooks yourself.
