Bayesian regression for low-noise data using POPS algorithm
POPSRegression
Linear regression scheme from the paper
Parameter uncertainties for imperfect surrogate models in the low-noise regime
TD Swinburne and D Perez, Machine Learning: Science and Technology 2025
@article{swinburne2025,
author={Swinburne, Thomas and Perez, Danny},
title={Parameter uncertainties for imperfect surrogate models in the low-noise regime},
journal={Machine Learning: Science and Technology},
doi={10.1088/2632-2153/ad9fce},
year={2025}
}
Installation
A PR to scikit-learn is planned; in the meantime, install from PyPI:
pip install POPSRegression
What is POPSRegression?
Bayesian regression for low-noise data (vanishing aleatoric uncertainty).
Fits the weights of a regression model using BayesianRidge, then estimates weight uncertainties (sigma_ in BayesianRidge) that account for model misspecification using the POPS (Pointwise Optimal Parameter Sets) algorithm [1]. The alpha_ attribute, which estimates aleatoric uncertainty, is not used for predictions, since in this regime it should correctly be assumed negligible.
Bayesian regression is often used in computational science to fit the weights of a surrogate model which approximates some complex calculation. In many important cases the target calculation is near-deterministic, or low-noise, meaning the true data has vanishing aleatoric uncertainty. However, there can be large misspecification uncertainty, i.e. the model weights are intrinsically uncertain because the model is unable to exactly match the training data.
Existing Bayesian regression schemes based on loss minimization can only estimate epistemic and aleatoric uncertainties. In the low-noise limit, weight uncertainties (sigma_ in BayesianRidge) are significantly underestimated, as they only account for epistemic uncertainties, which decay with increasing data. Predictions then attribute any additional error to an aleatoric uncertainty (alpha_ in BayesianRidge), which is erroneous in a low-noise setting. This has significant implications for how uncertainty is propagated using weight uncertainties.
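The decay of epistemic weight uncertainty with increasing data can be illustrated with plain scikit-learn. The following is a minimal sketch, not part of POPSRegression: the target function, sample sizes and variable names are assumptions chosen for illustration. A straight line is fitted to a noise-free oscillatory target, so the misfit stays roughly constant while the sigma_ entry keeps shrinking.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge


def target(x):
    # Hypothetical noise-free target that a straight line cannot represent
    return np.sin(6.0 * x) + 0.5 * x


rng = np.random.default_rng(0)
results = {}
for n in (50, 500, 5000):
    x = rng.uniform(-1.0, 1.0, size=n)
    X = x[:, None]                      # single linear feature
    y = target(x)                       # no noise is added
    model = BayesianRidge().fit(X, y)
    rmse = np.sqrt(np.mean((model.predict(X) - y) ** 2))
    weight_std = np.sqrt(model.sigma_[0, 0])
    results[n] = (weight_std, rmse)
    print(f"N={n:5d}  weight std={weight_std:.4f}  training RMSE={rmse:.3f}")
```

In this sketch the reported weight standard deviation falls as N grows even though the model never fits the data any better, which is the underestimation described above.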
Example usage
Usage follows sklearn.linear_model; POPSRegression inherits from BayesianRidge.
After running BayesianRidge.fit(..), the alpha_ attribute is not used for predictions.
The sigma_ matrix still contains epistemic weight uncertainties, whilst misspecification_sigma_ contains the POPS uncertainties.
from POPSRegression import POPSRegression
X_train,X_test,y_train,y_test = ...
# Sobol resampling of hypercube with 1.0 samples / training point
model = POPSRegression(resampling_method='sobol',resample_density=1.)
# fit the model, sample POPS hypercube
model.fit(X_train,y_train)
# Return mean and hypercube std
y_pred, y_std = model.predict(X_test,return_std=True)
# can also return max/min
y_pred, y_std, y_max, y_min = model.predict(X_test,return_std=True,return_bounds=True)
# can also return the epistemic uncertainty separately
y_pred, y_std, y_max, y_min, y_epistemic_std = model.predict(X_test,return_std=True,return_bounds=True,return_epistemic_std=True)
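For comparison, the way plain BayesianRidge combines sigma_ and alpha_ in its own return_std can be checked with scikit-learn alone. This is a sketch based on scikit-learn's BayesianRidge behaviour, not on POPSRegression internals; the data and the names weight_var and noise_var are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
# Hypothetical deterministic, nonlinear target (no noise added)
y = np.sin(3.0 * X[:, 0]) + X[:, 1] * X[:, 2]

model = BayesianRidge().fit(X, y)

X_test = rng.uniform(-1.0, 1.0, size=(20, 3))
y_pred, y_std = model.predict(X_test, return_std=True)

# Variance propagated from the weight covariance: x^T sigma_ x for each row x
weight_var = np.einsum("ij,jk,ik->i", X_test, model.sigma_, X_test)
# Constant variance from the estimated noise precision alpha_
noise_var = 1.0 / model.alpha_

print("weight-only std:", np.sqrt(weight_var[:3]))
print("BayesianRidge std:", y_std[:3])
```

Here the standard deviation returned by BayesianRidge is the weight term plus a constant 1/alpha_ term; the latter is the contribution that the text above says is not used for predictions in the low-noise setting.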
Toy example
An extreme low-dimensional case: fitting a quartic polynomial (P=5 parameters) to N data points drawn from a complex oscillatory function.
Green: two sigma of sigma_ weight uncertainty from Bayesian Regression (i.e. without alpha_ term for aleatoric error)
Orange: two sigma of sigma_ and misspecification_sigma_ posterior from POPS Regression
Gray: min-max of posterior from POPS Regression
As can be seen, the final error bars give very good coverage of the test output
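A rough analogue of the green band can be set up with scikit-learn alone. This is only a sketch under stated assumptions: the oscillatory target, the sample sizes and the helper names oscillatory and quartic_features are hypothetical and are not taken from the paper or the package. It measures how many test points fall inside the two-sigma band obtained from sigma_ alone.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge


def oscillatory(x):
    # Hypothetical noise-free target; the function used in the original figure is not given here
    return np.sin(10.0 * x) + x


def quartic_features(x):
    # Columns x, x^2, x^3, x^4; the fitted intercept supplies the fifth parameter
    return np.vander(x, 5, increasing=True)[:, 1:]


rng = np.random.default_rng(1)
x_train = rng.uniform(-1.0, 1.0, size=1000)
x_test = np.linspace(-1.0, 1.0, 400)

model = BayesianRidge().fit(quartic_features(x_train), oscillatory(x_train))

X_test = quartic_features(x_test)
y_pred = model.predict(X_test)
# Two-sigma band from the weight covariance only (no alpha_ term)
weight_std = np.sqrt(np.einsum("ij,jk,ik->i", X_test, model.sigma_, X_test))

inside = np.abs(oscillatory(x_test) - y_pred) <= 2.0 * weight_std
coverage = inside.mean()
print(f"fraction of test points inside the two-sigma weight band: {coverage:.2f}")
```

In this setup the weight-only band covers only a small fraction of the test points, which is the shortfall that the misspecification_sigma_ term is meant to address.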