Explaining the predictions of any machine learning model

LEMON is a technique for explaining why a machine learning model makes a given prediction. It does so by providing feature contributions: a score for each feature that indicates how much that feature contributed to the final prediction. More precisely, it shows the sensitivity of the prediction to each feature: a small change in an important feature's value results in a relatively large change in the prediction. LEMON is similar to the popular LIME explanation technique, but is more faithful to the reference model, especially for larger datasets.
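
To make the notion of sensitivity concrete, here is a minimal sketch of a generic local linear surrogate. It is not LEMON's implementation: the helper names `sample_in_ball` and `local_sensitivity`, the uniform sampling in a ball, and the plain linear regression are assumptions chosen for illustration, and the paper remains the reference for the actual method.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def sample_in_ball(center, radius, n_samples, rng):
    """Draw points uniformly from a ball of the given radius around center."""
    d = center.shape[0]
    # Random directions: normalised Gaussian vectors are uniform on the sphere.
    directions = rng.normal(size=(n_samples, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Scaling by u ** (1 / d) makes the points uniform in volume.
    radii = radius * rng.uniform(size=(n_samples, 1)) ** (1.0 / d)
    return center + directions * radii


def local_sensitivity(predict, instance, radius=0.5, n_samples=2000, seed=0):
    """Fit a linear surrogate around instance and return its coefficients."""
    rng = np.random.default_rng(seed)
    samples = sample_in_ball(instance, radius, n_samples, rng)
    surrogate = LinearRegression().fit(samples, predict(samples))
    return surrogate.coef_


def toy_model(X):
    """Hypothetical model: very sensitive to feature 0, barely to feature 1."""
    return 3.0 * X[:, 0] + 0.1 * X[:, 1]


scores = local_sensitivity(toy_model, np.array([1.0, 2.0]))
print(scores)
```

The coefficient for the first feature comes out much larger than for the second, which is the sense in which a small change in an important feature produces a relatively large change in the prediction.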

Installation

To install, use pip:

$ pip install lemon-explainer

Example

A minimal working example is shown below:

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lemon import LemonExplainer

# Load dataset
data = load_iris(as_frame=True)
X = data.data
y = pd.Series(np.array(data.target_names)[data.target])

# Train complex model
clf = RandomForestClassifier()
clf.fit(X, y)

# Explain instance
explainer = LemonExplainer(X, radius_max=0.5)
instance = X.iloc[-1, :]
explanation = explainer.explain_instance(instance, clf.predict_proba)[0]
explanation.show_in_notebook()

Development

For a development installation (requires npm or yarn), clone the repository:

$ git clone https://github.com/iamDecode/lemon.git
$ cd lemon

You may want to (create and) activate a virtual environment:

$ python3 -m venv venv
$ source venv/bin/activate

Install requirements:

$ pip install -r requirements.txt

Run the tests with:

$ pytest .

Approximating LIME's distance kernel

If you prefer a Gaussian distance kernel like the one used in LIME, you can approximate this behavior with:

import numpy as np
from scipy.special import gammainccinv
from lemon import LemonExplainer, gaussian_kernel

# X is the feature DataFrame from the example above.
DIMENSIONS = X.shape[1]
KERNEL_SIZE = np.sqrt(DIMENSIONS) * 0.75  # kernel size as used in LIME

# Obtain a distance kernel very close to LIME's Gaussian kernel; see the paper for details.
p = 0.999
radius = KERNEL_SIZE * np.sqrt(2 * gammainccinv(DIMENSIONS / 2, 1 - p))

def kernel(x):
    return gaussian_kernel(x, KERNEL_SIZE)

explainer = LemonExplainer(X, distance_kernel=kernel, radius_max=radius)

This behavior is as close as possible to LIME, but still yields more faithful explanations due to LEMON's improved sampling technique. Read the paper for more details about this approach.
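
One way to read the radius formula above, offered as an interpretation rather than as the paper's derivation: if the Gaussian kernel is viewed as an isotropic Gaussian density with standard deviation `KERNEL_SIZE`, the formula gives the radius of the ball that contains a fraction `p` of its mass. The sketch below checks this with SciPy alone; the helper name `coverage_radius` and the choice of four dimensions are assumptions made for illustration.

```python
import numpy as np
from scipy.special import gammainccinv
from scipy.stats import chi2


def coverage_radius(kernel_size, dimensions, p=0.999):
    """Radius of the ball holding a fraction p of an isotropic Gaussian's mass."""
    return kernel_size * np.sqrt(2 * gammainccinv(dimensions / 2, 1 - p))


dimensions = 4  # for example, the four iris features
kernel_size = np.sqrt(dimensions) * 0.75
radius = coverage_radius(kernel_size, dimensions)

# The squared, scaled distance follows a chi-squared distribution with
# `dimensions` degrees of freedom, so its p-quantile gives the same radius.
radius_chi2 = kernel_size * np.sqrt(chi2.ppf(0.999, df=dimensions))

# Monte Carlo check: the share of Gaussian samples that fall inside the radius.
rng = np.random.default_rng(0)
points = rng.normal(scale=kernel_size, size=(200_000, dimensions))
fraction_inside = np.mean(np.linalg.norm(points, axis=1) <= radius)

print(radius, radius_chi2, fraction_inside)
```

Under this reading, `p` trades off coverage against radius: values closer to 1 capture more of the kernel's mass but require a larger `radius_max`.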

Citation

If you want to refer to our explanation technique, please cite our paper using the following BibTeX entry:

@inproceedings{collaris2023lemon,
  title={{LEMON}: Alternative Sampling for More Faithful Explanation Through Local Surrogate Models},
  author={Collaris, Dennis and Gajane, Pratik and Jorritsma, Joost and van Wijk, Jarke J and Pechenizkiy, Mykola},
  booktitle={Advances in Intelligent Data Analysis XXI: 21st International Symposium on Intelligent Data Analysis (IDA 2023)},
  pages={77--90},
  year={2023},
  organization={Springer}
}

License

This project is licensed under the BSD 2-Clause License - see the LICENSE file for details.
