
Uncertainty quantification and model inference for machine learning models

Project description

Version badge Python badge License badge Format badge Linting badge Test badge tests

ML Uncertainty

ML Uncertainty is a Python package that provides a scikit-learn-like interface for obtaining prediction intervals and model parameter estimates for machine learning models in fewer than four lines of code.

It is built on top of the scikit-learn and autograd packages, and is distributed under the MIT license.

This package has been built by Archit Datar (architdatar@gmail.com).

Getting started

Install from PyPI with

pip install ml-uncertainty

Intended audience

This package is intended to benefit data scientists and ML enthusiasts.

Motivation

  • Too often in machine learning, we fit complex models but cannot quantify their precision via prediction intervals or feature significance.

  • This is especially true of the scikit-learn environment, which is extremely easy to use but does not offer these functionalities.

  • However, in many use cases, especially where we have small and fat datasets, these insights are critical to producing reliable models.

  • Enter ML Uncertainty! This provides an easy API to get all these insights from models.

  • It takes scikit-learn fitted models as inputs and uses appropriate statistics to quantify the uncertainties in ML models.

Computing statistics is as easy as:

# X and y are the training data; regr is an already-fitted scikit-learn estimator.
# Set up the model inference.
inf = ParametricModelInference()
inf.set_up_model_inference(X_train=X, y_train=y, estimator=regr)

# Get parameter importance estimates.
df_imp = inf.get_parameter_errors()

# Get prediction intervals.
df_int = inf.get_intervals(X)

Features

  1. Model parameter significance testing: Tests whether the given model parameters are statistically significant.

    For ensemble models, it can indicate whether given features are truly important or only appear so because of the instability of the model.

  2. Prediction intervals: Can produce prediction and confidence intervals for parametric and non-parametric ML models.

  3. Error propagation: Propagates error from input / model parameters to the outputs.

  4. Non-Linear regression: Scikit-learn-style API to fit non-linear models.
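To make the error propagation feature more concrete, here is a minimal sketch of first-order (delta-method) propagation: the output covariance is approximated as J Σ Jᵀ, where Σ is the parameter covariance and J is the Jacobian of the model with respect to its parameters. This is not the package's ErrorPropagation implementation. It uses central finite differences with NumPy, whereas the package is built on autograd, and the names `numerical_jacobian`, `propagate_variance`, and `line` are hypothetical.

```python
import numpy as np


def numerical_jacobian(func, params, eps=1e-6):
    """Central-difference Jacobian of func with respect to params."""
    params = np.asarray(params, dtype=float)
    f0 = np.atleast_1d(func(params))
    jac = np.zeros((f0.size, params.size))
    for j in range(params.size):
        step = np.zeros_like(params)
        step[j] = eps
        f_plus = np.atleast_1d(func(params + step))
        f_minus = np.atleast_1d(func(params - step))
        jac[:, j] = (f_plus - f_minus) / (2 * eps)
    return jac


def propagate_variance(func, params, params_cov):
    """First-order variance of each output: diag(J @ cov @ J.T)."""
    jac = numerical_jacobian(func, params)
    out_cov = jac @ np.asarray(params_cov, dtype=float) @ jac.T
    return np.diag(out_cov)


# Illustrative model (an assumption): y = slope * x + intercept.
x = np.array([0.0, 1.0, 2.0])


def line(p):
    return p[0] * x + p[1]


params = np.array([2.0, 1.0])          # slope, intercept
params_cov = np.diag([0.04, 0.01])     # independent parameter variances
out_var = propagate_variance(line, params, params_cov)
print(out_var)
```

For a model that is linear in its parameters, as here, the first-order approximation is exact: each output variance equals x² · var(slope) + var(intercept). For non-linear models it is only an approximation that holds when the parameter uncertainties are small.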

Installation

Dependencies

Python versions: See badges above.
Packages: See requirements.txt.

User installation

See ./docs/installation.md.

Examples

The examples produce additional plots and calculations that require other packages. These can be installed using:

pip install matplotlib seaborn jupyter scikit-fda

Check out some of these examples to try out the package. They are best run in VS Code.

Theoretical foundations

Discussion about the theory used can be found here:

Benchmarking

The NonLinearRegression, ParametricModelInference, and ErrorPropagation classes have been benchmarked against the Python statsmodels package. The code for this can be found here.

To run the benchmarking code, please install statsmodels using:

pip install statsmodels==0.14.0

To the best of my knowledge, there is no existing code against which EnsembleModelInference can be benchmarked. However, the code follows the ideas developed in the work by Zhang et al. (2020). The test is that a $(1-\alpha)\times100$ % prediction interval must contain a proportion $(1-\alpha)$ of the training data. See the benchmarking code here.
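The coverage criterion described above can be sketched as follows. This is an illustration under stated assumptions, not the package's benchmarking code: the helper `empirical_coverage` is hypothetical, and the data and interval bounds are synthetic (standard normal draws with a fixed two-sided normal interval) rather than intervals produced by EnsembleModelInference.

```python
from statistics import NormalDist

import numpy as np


def empirical_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    y_true = np.asarray(y_true, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    inside = (y_true >= lower) & (y_true <= upper)
    return float(inside.mean())


alpha = 0.10  # nominal coverage is 1 - alpha = 0.90

# Synthetic "observations": standard normal draws.
rng = np.random.default_rng(0)
y = rng.normal(size=10_000)

# Two-sided (1 - alpha) interval for a standard normal variable.
z = NormalDist().inv_cdf(1 - alpha / 2)
lower = np.full_like(y, -z)
upper = np.full_like(y, z)

coverage = empirical_coverage(y, lower, upper)
print(f"nominal: {1 - alpha:.2f}, empirical: {coverage:.3f}")
```

In practice, `lower` and `upper` would be the per-sample interval bounds returned by the model inference, and the empirical coverage would be compared with $1-\alpha$ within sampling error.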

Credits

  1. This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

  2. Some functions in ParametricModelInference are adopted from a GitHub repo by sriki18.
