
Model Error Analysis Python package

Project description

Model Error Analysis Workflow

Introduction

mealy is a Python package to perform Model Error Analysis of scikit-learn models leveraging a Model Performance Predictor.

The project is currently maintained by Dataiku's research team. This is an alpha version.

Getting started

The MEA documentation features examples to help you get started with Model Error Analysis:

Model Error Analysis

After training a ML model, data scientists need to investigate the model failures to build intuition on the critical sub-populations on which the model is performing poorly. This analysis is essential in the iterative process of model design and feature engineering and is usually performed manually.

The mealy package streamlines the analysis of the samples contributing most to model errors. It provides automatic tools to break down the model errors into meaningful groups that are easier to analyze, to highlight the most frequent types of errors, and to surface the problematic features correlated with the failures.

We call the model under investigation the primary model.

This approach relies on a Model Performance Predictor (MPP), a secondary model trained to predict whether the primary model's prediction is correct or wrong, i.e. a success or a failure. More precisely, the MPP is a binary decision tree classifier predicting whether the primary model will yield a Correct Prediction or a Wrong Prediction.

The MPP can be trained on any dataset meant to evaluate the primary model's performance, and thus containing ground truth labels. In particular, the provided primary test set is split into a secondary training set, used to train the MPP, and a secondary test set, used to compute the MPP metrics.
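The idea can be sketched with plain scikit-learn (a minimal illustration on synthetic data, not mealy's internal implementation; the variable names are placeholders):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# the primary model under investigation
primary_model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# binary target for the MPP: 1 = Correct Prediction, 0 = Wrong Prediction
is_correct = (primary_model.predict(X_test) == y_test).astype(int)

# split the primary test set into secondary train/test sets
X_sec_train, X_sec_test, c_train, c_test = train_test_split(
    X_test, is_correct, random_state=0)

# the MPP: a decision tree predicting primary-model success or failure
mpp = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_sec_train, c_train)
print("MPP accuracy:", mpp.score(X_sec_test, c_test))
```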

In classification tasks, a model failure is a wrongly predicted class, whereas in regression tasks a failure is defined as a large deviation of the predicted value from the true one. In the latter case, when the absolute difference between the predicted and the true value is higher than a threshold ε, the model outcome is considered a Wrong Prediction. The threshold ε is computed as the knee point of the Regression Error Characteristic (REC) curve, ensuring that the absolute error of primary predictions stays within tolerable limits.

The leaves of the MPP decision tree break down the test dataset into smaller segments with similar features and similar model performances. Analyzing the sub-population in the error leaves, and comparing with the global population, provides insights about critical features correlated with the model failures.
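This leaf-level breakdown can be sketched directly with scikit-learn's tree API (illustrative only: mealy automates this; here synthetic class labels stand in for the correct/wrong target):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# toy secondary dataset: is_correct plays the role of the MPP target
X, is_correct = make_classification(n_samples=600, n_informative=4, random_state=1)

mpp = DecisionTreeClassifier(max_depth=3, random_state=1).fit(X, is_correct)

# assign each sample to its MPP leaf and measure the error rate per leaf
leaf_ids = mpp.apply(X)
for leaf in np.unique(leaf_ids):
    mask = leaf_ids == leaf
    error_rate = 1 - is_correct[mask].mean()
    # compare a feature inside the leaf against the global population
    print(f"leaf {leaf}: {mask.sum()} samples, error rate {error_rate:.2f}, "
          f"feature_0 mean {X[mask, 0].mean():+.2f} vs global {X[:, 0].mean():+.2f}")
```

Leaves whose error rate is far above the global rate, and whose feature statistics differ markedly from the global ones, point at the critical sub-populations.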

The mealy package helps the user focus on the problematic features and their typical values among the mis-predicted samples. This information can later be exploited to support the strategy selected by the user:

  • improve model design: remove a problematic feature, remove samples likely to be mislabeled, ensemble with a model trained on a problematic sub-population, ...
  • enhance data collection: gather more data on the most erroneous, under-represented populations,
  • select critical samples for manual inspection thanks to the MPP, and avoid primary predictions on them, generating model assertions.

The typical workflow in the iterative model design supported by error analysis is illustrated in the figure below.

Model Error Analysis Workflow

Getting started with mealy

Let (X_train, y_train) be the training data of the model to analyze, (X_test, y_test) its test set, and feature_names the list of feature names. The Model Error Analysis can be performed as:

from sklearn.ensemble import RandomForestClassifier

from mealy.error_analyzer import ErrorAnalyzer
from mealy.error_visualizer import ErrorVisualizer

# train any scikit-learn model
model = RandomForestClassifier()
model.fit(X_train, y_train)

# fit a Model Performance Predictor on the model performances
error_analyzer = ErrorAnalyzer(model, feature_names=feature_names)
error_analyzer.fit(X_test, y_test)

# print metrics regarding the Model Performance Predictor
print(error_analyzer.mpp_summary(X_test, y_test, output_dict=False))

# plot the Model Performance Predictor Decision Tree
error_visualizer = ErrorVisualizer(error_analyzer)
error_visualizer.plot_error_tree()

# print details about the decision tree nodes containing the majority of errors
error_analyzer.error_node_summary(leaf_selector="all_errors", add_path_to_leaves=True, print_summary=True)
# plot the feature distributions of samples in the nodes containing the majority of errors
# rank features by correlation to error
error_visualizer.plot_feature_distributions_on_leaves(leaf_selector="all_errors", top_k_features=3)

Using mealy with a pipeline to undo feature pre-processing

Let (X_train, y_train) be the training data of the model to analyze, and (X_test, y_test) its test set. As an example, the numeric features numerical_feature_names are pre-processed by a simple imputer and a standard scaler, while the categorical features categorical_feature_names are one-hot encoded. The full pre-processing is wrapped into a scikit-learn column transformer and given as the first step of a Pipeline object, whose last step is the model to analyze.

Among the transformers available in sklearn.preprocessing, KBinsDiscretizer and PolynomialFeatures are currently not supported.

The Model Error Analysis can be performed as:

from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer

from mealy.error_analyzer import ErrorAnalyzer
from mealy.error_visualizer import ErrorVisualizer

transformers = [
    (make_pipeline(SimpleImputer(), StandardScaler()), numerical_feature_names),
    (OneHotEncoder(handle_unknown='ignore'), categorical_feature_names)
]

preprocess = make_column_transformer(
    *transformers
)

pipeline_model = make_pipeline(
    preprocess,
    RandomForestClassifier())

# train a pipeline model
pipeline_model.fit(X_train, y_train)

# fit a Model Performance Predictor on the model performances
error_analyzer = ErrorAnalyzer(pipeline_model, feature_names=feature_names)
error_analyzer.fit(X_test, y_test)

# print metrics regarding the Model Performance Predictor
print(error_analyzer.mpp_summary(X_test, y_test, output_dict=False))

# plot the Model Performance Predictor Decision Tree
error_visualizer = ErrorVisualizer(error_analyzer)
error_visualizer.plot_error_tree()

# print details about the decision tree nodes containing the majority of errors
error_analyzer.error_node_summary(leaf_selector="all_errors", add_path_to_leaves=True, print_summary=True)
# plot the feature distributions of samples in the nodes containing the majority of errors
# rank features by correlation to error
error_visualizer.plot_feature_distributions_on_leaves(leaf_selector="all_errors", top_k_features=3)

Installation

Dependencies

mealy depends on:

  • Python >= 3.5
  • NumPy >= 1.11
  • SciPy >= 0.19
  • scikit-learn >= 0.19
  • matplotlib >= 2.0
  • graphviz >= 0.14
  • pydotplus >= 2.0
  • kneed == 0.6

Installing with pip

The easiest way to install mealy is to use pip. For a vanilla install, simply type:

pip install -U mealy

Contributing

Contributions are welcome. Check out our contributing guidelines.
