Bringing back uncertainty to machine learning.
Doubt
A Python package to include prediction intervals in the predictions of machine learning models, to quantify their uncertainty.
Installation
If you do not already have HDF5 installed, start by installing it. On macOS this can be done with MacPorts, once MacPorts has been installed:
sudo port install hdf5
On Ubuntu you can get HDF5 with:
sudo apt-get install python-dev python3-dev libhdf5-serial-dev
After that, you can install doubt with pip:
pip install doubt
Features
- Bootstrap wrapper for all Scikit-Learn models
- Can also be used to calculate usual bootstrapped statistics of a dataset
- Quantile Regression for all generalised linear models
- Quantile Regression Forests
- A uniform dataset API, with 24 regression datasets and counting
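To make the idea of "bootstrapped statistics of a dataset" concrete, here is a minimal sketch of a percentile bootstrap using only the Python standard library. It is not the package's own implementation: the function name `bootstrap_interval` and its parameters are hypothetical, and `uncertainty` is merely named after the argument used in the Quick Start examples, assumed here to mean the total probability mass left outside the interval.

```python
import random
import statistics


def bootstrap_interval(data, statistic=statistics.fmean, n_boots=1000,
                       uncertainty=0.05, seed=0):
    """Percentile bootstrap interval for a statistic of a dataset.

    Hypothetical helper for illustration; not part of doubt.
    """
    rng = random.Random(seed)
    n = len(data)

    # Resample the data with replacement and recompute the statistic each time
    estimates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_boots)
    )

    # Read the interval endpoints off the sorted bootstrap estimates
    lower_index = int((uncertainty / 2) * (n_boots - 1))
    upper_index = int((1 - uncertainty / 2) * (n_boots - 1))
    return statistic(data), (estimates[lower_index], estimates[upper_index])


# Synthetic data, generated only for this example
data_rng = random.Random(42)
sample = [data_rng.gauss(10.0, 2.0) for _ in range(200)]

point, (lower, upper) = bootstrap_interval(sample)
print(point, (lower, upper))
```

A smaller `uncertainty` moves the endpoints further into the tails of the bootstrap estimates, so the interval widens.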
Quick Start
If you already have a model in Scikit-Learn, then you can simply wrap it in a Boot to enable predicting with prediction intervals:
>>> from sklearn.linear_model import LinearRegression
>>> from doubt import Boot
>>> from doubt.datasets import PowerPlant
>>>
>>> X, y = PowerPlant().split()
>>> clf = Boot(LinearRegression())
>>> clf = clf.fit(X, y)
>>> clf.predict([10, 30, 1000, 50], uncertainty=0.05)
(481.9203102126274, array([473.43314309, 490.0313962 ]))
Alternatively, you can use one of the standalone models with uncertainty outputs. For instance, a QuantileRegressionForest:
>>> from doubt import QuantileRegressionForest as QRF
>>> from doubt.datasets import Concrete
>>> import numpy as np
>>>
>>> X, y = Concrete().split()
>>> clf = QRF(max_leaf_nodes=8)
>>> clf.fit(X, y)
>>> clf.predict(np.ones(8), uncertainty=0.25)
(16.933590347847982, array([ 8.93456428, 26.0664534 ]))
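As general background on the quantile-based models, the sketch below illustrates the pinball (quantile) loss, the objective commonly used in quantile regression: the constant that minimises it over a dataset is an empirical quantile of that dataset. This is a standalone illustration under that assumption, not how doubt computes its intervals (quantile regression forests in particular usually derive quantiles from the training targets in each leaf rather than from this loss). The helper names `pinball_loss` and `best_constant` are hypothetical.

```python
def pinball_loss(y_true, prediction, quantile):
    """Mean pinball loss of a constant prediction at the given quantile."""
    total = 0.0
    for y in y_true:
        error = y - prediction
        # Under-predictions are weighted by quantile, over-predictions by 1 - quantile
        total += quantile * error if error >= 0 else (quantile - 1) * error
    return total / len(y_true)


def best_constant(y_true, quantile):
    """Return the data point that minimises the pinball loss.

    The loss is piecewise linear in the prediction, so a minimiser
    can always be found among the observed values.
    """
    return min(y_true, key=lambda c: pinball_loss(y_true, c, quantile))


values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]

median = best_constant(values, 0.5)
upper = best_constant(values, 0.9)
print(median, upper)  # 6.0 10.0
```

Fitting a model at a low and a high quantile in this way gives two curves that can be read together as a prediction interval.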