The SLISE algorithm for robust regression and explanations of black box models
SLISE - Sparse Linear Subset Explanations
Python implementation of the SLISE algorithm. The SLISE algorithm can be used both for robust regression and to explain outcomes from black box models. For more details, see the original paper or the robust regression paper. For a more informal overview, see the presentation or the poster.
Björklund A., Henelius A., Oikarinen E., Kallonen K., Puolamäki K. (2019).
Sparse Robust Regression for Explaining Classifiers.
Discovery Science (DS 2019).
Lecture Notes in Computer Science, vol 11828, Springer.
https://doi.org/10.1007/978-3-030-33778-0_27
Björklund A., Henelius A., Oikarinen E., Kallonen K., Puolamäki K. (2022).
Robust regression via error tolerance.
Data Mining and Knowledge Discovery.
https://doi.org/10.1007/s10618-022-00819-2
The idea
In robust regression we fit regression models that can handle data containing outliers (see the example below for why outliers are problematic for normal regression). SLISE accomplishes this by fitting a model such that the largest possible subset of the data items has an error smaller than a given tolerance. All items with a larger error are considered potential outliers and do not affect the resulting model.
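The subset criterion described above can be illustrated with a small sketch. It does not use the slise package: the helper `within_tolerance`, the toy data, and the tolerance value are assumptions made for illustration. It only counts how many items fall within the tolerance for two candidate lines, which is the quantity SLISE tries to make large.

```python
import numpy as np


def within_tolerance(X, y, coef, intercept, epsilon):
    """Boolean mask of the items whose absolute error is below epsilon."""
    residuals = y - (X @ coef + intercept)
    return np.abs(residuals) < epsilon


# Toy data: 40 items on the line y = 2x + 1 and 10 shifted outliers.
x = np.linspace(0.0, 1.0, 50)
X = x.reshape(-1, 1)
y = 2.0 * x + 1.0
y[40:] += 5.0

# Ordinary least squares (with an intercept column) uses every item,
# so the outliers pull the fitted line away from the majority.
A = np.column_stack([np.ones_like(x), x])
ols_intercept, ols_slope = np.linalg.lstsq(A, y, rcond=None)[0]

epsilon = 0.1  # hypothetical error tolerance
ols_mask = within_tolerance(X, y, np.array([ols_slope]), ols_intercept, epsilon)
true_mask = within_tolerance(X, y, np.array([2.0]), 1.0, epsilon)

print("items within tolerance, least squares:", int(ols_mask.sum()))
print("items within tolerance, y = 2x + 1:", int(true_mask.sum()))
```

The line that ignores the outliers covers far more items than the least-squares line, which is why a subset-based criterion prefers it.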
SLISE can also be used to provide local, model-agnostic explanations for outcomes from black box models. To do this we replace the ground truth response vector with the predictions from the complex model. Furthermore, we force the model to fit a selected item, which makes the explanation local. This gives us a local approximation of the complex model with a simpler linear model (similar to, e.g., LIME and SHAP). In contrast to other methods, SLISE creates explanations using real data (not discretised and randomly sampled data), so we can be sure that all inputs are valid, i.e., they lie in the correct data manifold and follow the constraints used to generate the data (e.g., the laws of physics).
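The two steps described above, replacing the response with model predictions and forcing the fit through a selected item, can be sketched as follows. This is a minimal illustration and not the slise package: `black_box` is a hypothetical stand-in for a complex model, the data is synthetic, and plain least squares replaces the SLISE optimisation. One common way to force a linear model through an item, assumed here, is to centre the data on that item and fit without an intercept.

```python
import numpy as np


def black_box(X):
    """Hypothetical stand-in for a complex model (not a real classifier)."""
    return np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2


rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))  # stand-in for a real dataset
y_hat = black_box(X)  # predictions replace the ground truth response

# Select the item to explain and centre the data on it. A linear model
# without an intercept in the centred space must pass through that item.
index = 0
x0, y0 = X[index], y_hat[index]
Xc, yc = X - x0, y_hat - y0

# Plain least squares stands in for the SLISE optimisation here.
coef = np.linalg.lstsq(Xc, yc, rcond=None)[0]


def local_model(X_new):
    """Linear approximation that is exact at the selected item."""
    return y0 + (X_new - x0) @ coef


epsilon = 0.2  # hypothetical error tolerance
subset = np.abs(local_model(X) - y_hat) < epsilon
print("coefficients:", coef)
print("items within tolerance:", int(subset.sum()), "of", len(X))
```

The coefficients can be read as the local effect of each variable on the black box output around the selected item, and the subset shows which real data items the linear approximation covers.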
Installation
To install this package just run:
pip install slise
Or install the latest version directly from GitHub with:
pip install git+https://github.com/edahelsinki/pyslise
Alternatively, you can download the repo and run `python -m build` to build a wheel, or `pip install .` to install it locally.
Other Languages
The (original) R implementation can be found here.
Examples
Here are two quick examples of SLISE in action. For more detailed examples, with descriptions of how to create and interpret them, see the examples directory.
SLISE is a robust regression algorithm, which means that it is able to handle outliers. This is in contrast to, e.g., ordinary least-squares regression, which gives skewed results when outliers are present.
SLISE can also be used to explain outcomes from black box models by locally approximating the complex models with a simpler linear model.
Dependencies
This implementation requires Python 3 and the following packages:
- matplotlib
- numba
- numpy
- PyLBFGS
- scipy
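To check which of these packages are already present in an environment, a small standard-library sketch such as the following can help. The helper `installed_versions` is a hypothetical name introduced here, and the lookup uses the distribution names exactly as listed above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as listed above.
DEPENDENCIES = ["matplotlib", "numba", "numpy", "PyLBFGS", "scipy"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(DEPENDENCIES).items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Installing slise with pip would normally pull in any missing dependencies, so this check is mainly useful when debugging an existing environment.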