Explaining the predictions of any machine learning model
LEMON is a technique for explaining individual predictions of machine learning models. It provides feature contributions: a score for each feature that indicates how much that feature contributed to the final prediction. More precisely, the score reflects the model's sensitivity to the feature: a small change in the value of an important feature results in a relatively large change in the prediction. LEMON is similar to the popular LIME explanation technique, but is more faithful to the reference model, especially for larger datasets.
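To make the notion of sensitivity concrete, here is a minimal sketch that estimates it with central finite differences on a toy function. This is not LEMON's algorithm and it does not use the lemon package; the `sensitivity` helper and `toy_model` are hypothetical names introduced purely for illustration.

```python
def sensitivity(predict, instance, eps=1e-4):
    """Central finite-difference estimate of how much the prediction
    changes per unit change in each feature (illustrative only)."""
    scores = []
    for i in range(len(instance)):
        up = list(instance)
        down = list(instance)
        up[i] += eps
        down[i] -= eps
        scores.append((predict(up) - predict(down)) / (2 * eps))
    return scores


# Hypothetical toy model: depends strongly on feature 0, weakly on
# feature 1, and not at all on feature 2.
def toy_model(x):
    return 3.0 * x[0] + 0.1 * x[1] ** 2


scores = sensitivity(toy_model, [1.0, 2.0, 5.0])
print(scores)  # largest score for feature 0, zero for feature 2
```

A feature with a large score is one where a small change in its value moves the prediction a lot, which is the intuition behind the contribution scores described above.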
Installation
To install, use pip:
$ pip install lemon-explainer
Example
A minimal working example is shown below:
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lemon import LemonExplainer
# Load dataset
data = load_iris(as_frame=True)
X = data.data
y = pd.Series(np.array(data.target_names)[data.target])
# Train complex model
clf = RandomForestClassifier()
clf.fit(X, y)
# Explain instance
explainer = LemonExplainer(X, radius_max=0.5)
instance = X.iloc[-1, :]
explanation = explainer.explain_instance(instance, clf.predict_proba)[0]
explanation.show_in_notebook()
Development
For a development installation (requires npm or yarn), clone the repository:
$ git clone https://github.com/iamDecode/lemon.git
$ cd lemon
You may want to (create and) activate a virtual environment:
$ python3 -m venv venv
$ source venv/bin/activate
Install requirements:
$ pip install -r requirements.txt
And run the tests with:
$ pytest .
Approximating LIME's distance kernel
If you prefer a Gaussian distance kernel as used in LIME, you can approximate this behavior with:
import numpy as np
from scipy.special import gammainccinv
from lemon import LemonExplainer, gaussian_kernel
# X is the feature DataFrame from the example above
DIMENSIONS = X.shape[1]
KERNEL_SIZE = np.sqrt(DIMENSIONS) * 0.75  # kernel size as used in LIME
# Obtain a distance kernel very close to LIME's Gaussian kernel; see the paper for details.
# The radius is the distance within which a fraction p of an isotropic
# Gaussian with standard deviation KERNEL_SIZE lies.
p = 0.999
radius = KERNEL_SIZE * np.sqrt(2 * gammainccinv(DIMENSIONS / 2, 1 - p))
kernel = lambda x: gaussian_kernel(x, KERNEL_SIZE)
explainer = LemonExplainer(X, distance_kernel=kernel, radius_max=radius)
This behavior is as close as possible to LIME, but still yields more faithful explanations due to LEMON's improved sampling technique. Read the paper for more details about this approach.
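The radius formula above can be read as the quantile of a chi distribution: the distance within which a fraction p of an isotropic Gaussian with standard deviation KERNEL_SIZE lies. The sketch below recomputes that radius with the standard library only, as a sanity check on the formula. It is an illustration under stated assumptions, not part of the lemon package: `upper_gamma_q` and `mass_radius` are hypothetical helpers, and the closed form used here only holds for an even number of dimensions (such as the four features of the iris data).

```python
import math


def upper_gamma_q(m, t):
    """Regularized upper incomplete gamma Q(m, t) for a positive integer m."""
    return math.exp(-t) * sum(t ** i / math.factorial(i) for i in range(m))


def mass_radius(dimensions, sigma, p=0.999):
    """Radius containing a fraction p of an isotropic Gaussian's mass.

    Hypothetical helper: supports even `dimensions` only, where Q has a
    closed form, and solves Q(dimensions / 2, t) = 1 - p by bisection.
    """
    if dimensions % 2 != 0:
        raise ValueError("this sketch only supports an even number of dimensions")
    m = dimensions // 2
    lo, hi = 0.0, 1.0
    while upper_gamma_q(m, hi) > 1 - p:  # grow the bracket until it contains the root
        hi *= 2
    for _ in range(200):  # Q is decreasing in t, so bisection converges
        mid = (lo + hi) / 2
        if upper_gamma_q(m, mid) > 1 - p:
            lo = mid
        else:
            hi = mid
    return sigma * math.sqrt(2 * hi)


dimensions = 4                              # iris has four features
kernel_size = math.sqrt(dimensions) * 0.75  # same kernel size as above
radius = mass_radius(dimensions, kernel_size)
print(radius)
```

For an even number of dimensions this should agree with the `gammainccinv`-based expression up to numerical precision; for odd dimensions, use the SciPy version shown above.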
Citation
If you want to refer to our explanation technique, please cite our paper using the following BibTeX entry:
@inproceedings{collaris2023lemon,
title={{LEMON}: Alternative Sampling for More Faithful Explanation Through Local Surrogate Models},
author={Collaris, Dennis and Gajane, Pratik and Jorritsma, Joost and van Wijk, Jarke J and Pechenizkiy, Mykola},
booktitle={Advances in Intelligent Data Analysis XXI: 21st International Symposium on Intelligent Data Analysis (IDA 2023)},
pages={77--90},
year={2023},
organization={Springer}
}
License
This project is licensed under the BSD 2-Clause License - see the LICENSE file for details.