Uncertainty Tuning (UTuning) is a package that focuses on summarizing uncertainty model performance for optimum hyperparameter tuning by using the uncertainty model goodness metric.

Project description

Hyperparameter Uncertainty Tuning

This library tunes model hyperparameters using the uncertainty model goodness metric proposed by Maldonado-Cruz and Pyrcz (2021).

Maldonado-Cruz, E., & Pyrcz, M. J. (2021). Tuning machine learning dropout for subsurface uncertainty model accuracy. Journal of Petroleum Science and Engineering, 205, 108975. https://doi.org/10.1016/j.petrol.2021.108975

The figure compares the cross-validation plot and the corresponding accuracy plot for two uncertainty models whose hyperparameters were optimized with different objective functions: a) MAE, b) uncertainty model goodness. Both models have a high Pearson correlation coefficient, yet the model in b) is the better uncertainty model.
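To make the accuracy plot and the goodness score more concrete, here is a minimal sketch. It is not UTuning's own implementation. It assumes a common accuracy-plot construction: for each probability p, compute the fraction of true values that fall inside the symmetric p-probability interval of the ensemble, then summarize the departure from the 1:1 line with a goodness score that penalizes the inaccurate side (fraction below p) twice as heavily. The function names, the weighting, and the synthetic data are assumptions for illustration; the cited paper gives the exact definition used by the package.

```python
import numpy as np


def accuracy_curve(y_true, ensemble, probs):
    """Fraction of true values inside each symmetric p-probability interval.

    ensemble has shape (n_samples, n_realizations).
    """
    fractions = []
    for p in probs:
        lower = np.quantile(ensemble, 0.5 - p / 2, axis=1)
        upper = np.quantile(ensemble, 0.5 + p / 2, axis=1)
        inside = (y_true >= lower) & (y_true <= upper)
        fractions.append(inside.mean())
    return np.array(fractions)


def goodness(probs, fractions):
    """Assumed goodness score: 1 for a curve on the 1:1 line, lower otherwise."""
    accurate = (fractions >= probs).astype(float)
    weights = 3.0 * accurate - 2.0  # +1 where accurate, -2 where not
    integrand = weights * (fractions - probs)
    # Trapezoidal integration over the probability axis
    area = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(probs))
    return 1.0 - area


# Synthetic check: truth and ensemble drawn from the same distribution
rng = np.random.default_rng(0)
centers = rng.normal(size=400)
y_true = centers + rng.normal(size=400)
ensemble = centers[:, None] + rng.normal(size=(400, 200))

probs = np.linspace(0.0, 1.0, 21)
fractions = accuracy_curve(y_true, ensemble, probs)
g_demo = goodness(probs, fractions)
print(f"goodness of a well-calibrated ensemble: {g_demo:.3f}")
```

Under these assumptions, a model can have a high correlation between predictions and truth and still score poorly if its intervals are too narrow, which is the situation the figure illustrates.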

Features

This is what UTuning has to offer:

  • Hyperparameter tuning for ensemble-based uncertainty models
  • Robust uncertainty evaluation
  • Evaluation of uncertainty models

Installation

Dependencies

  • numpy (>=1.16)
  • scikit-learn (>=0.23)

User Installation

pip install UTuning

Examples

Tune Machine Learning uncertainty model with GridSearchCV

In this first example we use CatBoost as the ensemble learner to predict production.

The problem in this notebook example is to predict production from porosity, log-permeability, brittleness, and TOC data. We chose it because we are primarily interested in capturing the uncertainty of predictions made from existing data. The same workflow extends to any prediction problem.

To start, import UTuning's grid search cross-validation interface in place of scikit-learn's GridSearchCV; the rest of the workflow is almost identical.
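For reference, the scikit-learn pattern that this workflow mirrors looks roughly like the sketch below. It is only a baseline for comparison: the synthetic data, the GradientBoostingRegressor estimator, the small grid, and the MAE scoring are assumptions chosen so the snippet runs on its own, and none of it comes from UTuning.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression data standing in for a real dataset (assumption)
rng = np.random.default_rng(0)
X = rng.uniform(size=(120, 4))
y = X @ np.array([1.0, 0.5, -0.3, 0.2]) + rng.normal(scale=0.05, size=120)

# A deliberately small grid so the search finishes quickly
param_grid = {
    "learning_rate": [0.05, 0.1],
    "n_estimators": [20, 40],
}

# Standard grid search scored on (negated) mean absolute error
search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid,
    cv=2,
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)

print(search.best_params_)
```

The UTuning example that follows keeps the same fit-then-inspect shape but scores each candidate on uncertainty model goodness instead of a point-prediction error.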

"""
Created on Mon Sep 20 16:15:37 2021
@author: em42363
"""
from UTuning import scorer, plots, UTSearch

from catboost import CatBoostRegressor ## Decision-tree based gradient boosting
# Prediction model in the form of an ensemble of weak prediction models

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

import pandas as pd
import numpy as np

# Load the example dataset
df = pd.read_csv("https://raw.githubusercontent.com/emaldonadocruz/UTuning/master/dataset/unconv_MV.csv")

# %% Train/test split
'''
Min-max normalize the features and the response,
then split the data into train and test sets
'''

y = df['Production'].values
X = df[['Por', 'LogPerm', 'Brittle', 'TOC']].values

scaler = MinMaxScaler()
scaler.fit(X)
Xs = scaler.transform(X)
ys = (y - y.min())/ (y.max()-y.min())

X_train, X_test, y_train, y_test = train_test_split(Xs, ys, test_size=0.33)

print(X_train.shape, y_train.shape)

# %% Model creation
'''
We define the model and the grid search space,
then pass both to the UTuning grid search.
'''
n_estimators = np.arange(180, 220, step=1)  # 180, 181, ..., 219
lr = np.arange(0.035, 0.06, step=.001)  # 0.035 to about 0.06 in steps of 0.001
param_grid = {
    "learning_rate": list(lr),
    "n_estimators": list(n_estimators)
}

model = CatBoostRegressor(loss_function='RMSEWithUncertainty',
                          verbose=False)

random_cv = UTSearch.Grid(model, param_grid, 2)

random_cv.fit(X_train, y_train)
# %% Surface
'''
We can evaluate the hyperparameter search space and use UTuning
to plot the model goodness surface
'''
df = pd.DataFrame(random_cv.cv_results_)

labels = {'x': 'n estimators',
          'y': 'Learning rate',
          'z': 'Model goodness'}

plots.surface(df['param_n_estimators'],
              df['param_learning_rate'],
              df['split0_test_score'],
              30,
              labels)
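Beyond plotting the surface, the search results can be inspected directly with pandas. The sketch below assumes a results table with the same three columns used above and assumes that a higher score means a better uncertainty model. The numeric values are made-up placeholders so the snippet runs on its own; they are not results from the package.

```python
import pandas as pd

# Placeholder results table with the column names used in the example above
results = pd.DataFrame({
    "param_n_estimators": [180, 180, 181, 181],
    "param_learning_rate": [0.035, 0.036, 0.035, 0.036],
    "split0_test_score": [0.71, 0.78, 0.74, 0.69],  # made-up scores
})

# Row with the highest score (assumes higher is better)
best = results.loc[results["split0_test_score"].idxmax()]
print(best)

# Reshape the long table into a learning-rate by n-estimators grid
grid = results.pivot(
    index="param_learning_rate",
    columns="param_n_estimators",
    values="split0_test_score",
)
print(grid)
```

The pivoted grid has the same layout as the plotted surface, which makes it easy to check whether the best score sits on the edge of the search space and the ranges should be widened.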

A second example using neural networks is coming soon.

Credits


The dataset used for the examples is provided by Dr. Michael Pyrcz, GeostatsGuy: https://github.com/GeostatsGuy

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Cookiecutter: https://github.com/audreyr/cookiecutter
audreyr/cookiecutter-pypackage: https://github.com/audreyr/cookiecutter-pypackage

Download files

Source Distribution

UTuning-0.1.3.tar.gz (16.9 kB view details)

Uploaded Source

File details

Details for the file UTuning-0.1.3.tar.gz.

File metadata

  • Download URL: UTuning-0.1.3.tar.gz
  • Size: 16.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.6.0 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.8.12

File hashes

Hashes for UTuning-0.1.3.tar.gz
Algorithm Hash digest
SHA256 0040a7e996e70b2cd1f6adde0c415e009937719969e9e15be6b758e02c893c3d
MD5 35564804b98ffb623c2707c5a6ce9fd3
BLAKE2b-256 4b11a7770f04773ae3fe94374182b00d5f30f1a381865ff6d9aaed7e4457428c
