Bayesian regression for low-noise data with misspecification uncertainty (POPS algorithm).
popsregression
popsregression is a scikit-learn compatible
package providing POPSRegression, a Bayesian regression method for low-noise
data that accounts for model misspecification uncertainty.
Try it out! An online demo from the Kermode group compares multiple regression schemes.
Misspecification-aware Bayesian regression
Standard Bayesian regression (e.g. BayesianRidge) estimates epistemic and
aleatoric uncertainties, but provably ignores model misspecification: errors arising from limited model form (see the example below). In the low-noise (weak aleatoric, near-deterministic) limit, weight uncertainties (sigma_) are significantly underestimated because they capture only epistemic uncertainty, which decays with increasing data. Any remaining error is attributed to aleatoric noise (alpha_), which is erroneous in low-noise settings.
POPSRegression efficiently estimates model misspecification uncertainty
via the Pointwise Optimal Parameter Sets (POPS) algorithm, finding parameter perturbations that would fit each training point exactly.
The result is wider, more honest uncertainty estimates that properly cover the true function, even when the model class cannot perfectly represent the target.
The misspecified, near-deterministic regression problem that POPSRegression addresses is particularly relevant when fitting surrogate simulation models in computational science, e.g. interatomic potentials, where by construction the optimal surrogate model is structurally unable to capture the target function exactly.
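The idea of a parameter perturbation that fits one training point exactly can be sketched for plain least squares. The snippet below is a minimal NumPy illustration, not the package's implementation: the target function, the tiny ridge term, and all variable names are assumptions made for the example. For each point it computes the shift of the global fit that reproduces that point exactly at the smallest increase in total squared loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Near-deterministic target that a quartic polynomial cannot represent exactly
# (illustrative choice, not taken from the package).
x = rng.uniform(-1.0, 1.0, size=50)
y = np.sin(6.0 * x) + 0.3 * x

# Design matrix for a quartic polynomial: columns 1, x, x^2, x^3, x^4.
X = np.vander(x, N=5, increasing=True)

# Global least-squares fit; the tiny ridge term only keeps the solve stable.
A = X.T @ X + 1e-10 * np.eye(X.shape[1])
theta = np.linalg.solve(A, X.T @ y)

# Row i of AinvX is A^{-1} x_i; leverage h_i = x_i^T A^{-1} x_i.
AinvX = np.linalg.solve(A, X.T).T
leverage = np.einsum("ij,ij->i", X, AinvX)

# Pointwise correction: minimise d^T A d subject to x_i . (theta + d) = y_i,
# which gives d_i = (r_i / h_i) A^{-1} x_i with residual r_i = y_i - x_i . theta.
residual = y - X @ theta
delta = (residual / leverage)[:, None] * AinvX

# Each perturbed parameter vector reproduces its own training point.
pointwise_theta = theta + delta
pointwise_fit = np.einsum("ij,ij->i", X, pointwise_theta)
print(np.allclose(pointwise_fit, y))
```

The spread of the rows of `delta` gives a rough picture of how far the parameters must move to absorb the misfit, which is the quantity a misspecification-aware posterior has to cover.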
Example
Fitting a quartic polynomial (P=5 parameters) to a complex oscillatory function with N=10, 50, 500 training points. Top row: BayesianRidge epistemic uncertainty vanishes with more data. Bottom rows: POPS correctly maintains uncertainty where the polynomial deviates from the truth.
See the SimpleExample.ipynb notebook for a runnable version.
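The shrinking epistemic uncertainty described above can be reproduced with scikit-learn alone. The following is a small sketch under stated assumptions: the target function and the training-set sizes are illustrative choices (not those of the figure), and only BayesianRidge is used, so it shows the problem rather than the POPS correction.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge
from sklearn.preprocessing import PolynomialFeatures


def target(x):
    # Oscillatory, noise-free function a quartic cannot capture (assumed for illustration).
    return np.sin(6.0 * x) + 0.3 * x


rng = np.random.default_rng(0)
features = PolynomialFeatures(degree=4, include_bias=True)  # P=5 parameters

x_test = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
X_test = features.fit_transform(x_test)
y_test = target(x_test).ravel()

mean_epistemic_std = {}
rmse = {}
for n in (50, 500, 5000):
    x_train = rng.uniform(-1.0, 1.0, size=(n, 1))
    X_train = features.fit_transform(x_train)
    model = BayesianRidge(fit_intercept=False).fit(X_train, target(x_train).ravel())

    # Epistemic predictive variance from the weight covariance: x^T sigma_ x.
    var = np.einsum("ij,jk,ik->i", X_test, model.sigma_, X_test)
    mean_epistemic_std[n] = float(np.sqrt(var).mean())

    # Actual error of the posterior-mean prediction against the true function.
    rmse[n] = float(np.sqrt(np.mean((model.predict(X_test) - y_test) ** 2)))

for n in mean_epistemic_std:
    print(n, mean_epistemic_std[n], rmse[n])
```

The epistemic standard deviation falls as the training set grows while the error against the true function does not, which is the gap that misspecification uncertainty is meant to fill.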
Installation
pip install popsregression
Dependencies: scikit-learn >= 1.6.1, scipy >= 1.6.0, numpy >= 1.20.0
Quick start
from popsregression import POPSRegression
X_train, X_test, y_train, y_test = ...
# Fit POPSRegression
# fit_intercept=False by default
model = POPSRegression()
model.fit(X_train, y_train)
# Prediction with misspecification & epistemic uncertainty
y_pred, y_std = model.predict(X_test, return_std=True)
# Also return min/max bounds over the posterior
y_pred, y_std, y_max, y_min = model.predict(
    X_test, return_std=True, return_bounds=True
)
# Also return epistemic-only uncertainty separately
y_pred, y_std, y_max, y_min, y_epistemic_std = model.predict(
    X_test,
    return_std=True,
    return_bounds=True,
    return_epistemic_std=True,
)
Key parameters
| Parameter | Default | Description |
|---|---|---|
| posterior | 'hypercube' | Posterior form: 'hypercube' (PCA-aligned box) or 'ensemble' (raw corrections) |
| resampling_method | 'uniform' | Sampling method: 'uniform', 'sobol', 'latin', 'halton' |
| resample_density | 1.0 | Number of posterior samples per training point |
| leverage_percentile | 50.0 | Only use high-leverage training points for POPS posterior |
| mode_threshold | 1e-8 | Eigenvalue threshold for hypercube dimensionality |
| percentile_clipping | 0.0 | Percentile to clip from hypercube bounds (0–50) |
All BayesianRidge parameters (max_iter, tol, alpha_1, alpha_2,
lambda_1, lambda_2, fit_intercept, etc.) are also supported.
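To make the leverage_percentile option more concrete, here is a minimal sketch of leverage-based point selection. It assumes the standard statistical definition of leverage (the diagonal of the hat matrix) and a simple percentile cutoff; whether the package computes or thresholds leverage in exactly this way is an assumption, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Quartic polynomial features for 200 random inputs (illustrative data).
x = rng.uniform(-1.0, 1.0, size=200)
X = np.vander(x, N=5, increasing=True)

# Leverage h_i = x_i^T (X^T X)^{-1} x_i, the diagonal of the hat matrix.
A = X.T @ X
leverage = np.einsum("ij,ij->i", X, np.linalg.solve(A, X.T).T)

# Keep only points at or above the chosen percentile of leverage.
leverage_percentile = 50.0
cutoff = np.percentile(leverage, leverage_percentile)
selected = leverage >= cutoff

print(selected.sum(), "of", len(x), "points kept")
print("sum of leverages:", leverage.sum())  # equals the number of parameters
```

High-leverage points are those the fit depends on most strongly, so restricting to them reduces the number of pointwise corrections while keeping the most informative ones.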
Key attributes (after fitting)
| Attribute | Description |
|---|---|
| coef_ | Regression coefficients (posterior mean) |
| sigma_ | Epistemic variance-covariance matrix |
| misspecification_sigma_ | Misspecification variance-covariance matrix from POPS |
| posterior_samples_ | Samples from the POPS posterior |
| alpha_ | Estimated noise precision (not used for prediction) |
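For a linear model, a parameter variance-covariance matrix C translates into a predictive standard deviation sqrt(x^T C x) at each input x. The sketch below illustrates that propagation with made-up matrices standing in for sigma_ and misspecification_sigma_. Summing the two covariances is an assumption made for illustration only; the package may combine the contributions differently, and `predictive_std` is a hypothetical helper, not part of its API.

```python
import numpy as np


def predictive_std(X, covariance):
    """Per-row predictive standard deviation sqrt(x^T C x) for a linear model."""
    var = np.einsum("ij,jk,ik->i", X, covariance, X)
    return np.sqrt(np.clip(var, 0.0, None))


rng = np.random.default_rng(2)

# Quadratic features at a few test inputs (illustrative).
X_test = np.vander(np.linspace(-1.0, 1.0, 7), N=3, increasing=True)

# Hypothetical stand-ins for a fitted model's sigma_ and misspecification_sigma_.
epistemic_cov = 1e-4 * np.eye(3)
B = rng.normal(size=(3, 3))
misspecification_cov = 1e-2 * (B @ B.T)  # symmetric positive semi-definite

epistemic_std = predictive_std(X_test, epistemic_cov)
# Assumption: independent contributions, so the covariances add.
total_std = predictive_std(X_test, epistemic_cov + misspecification_cov)

print(epistemic_std)
print(total_std)
```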
Pipeline compatibility
POPSRegression is fully compatible with scikit-learn pipelines and
hyperparameter search:
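from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from popsregression import POPSRegression
# Polynomial feature expansion followed by POPS regression
pipe = make_pipeline(
    PolynomialFeatures(degree=4),
    POPSRegression(resampling_method='sobol'),
)
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)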
scikit-learn style documentation
https://tomswinburne.github.io/popsregression
Development
# Install in development mode
pip install -e .
# Run tests
pytest -vsl popsregression
# With pixi
pixi run test
pixi run lint
pixi run build-doc
Citation
Parameter uncertainties for imperfect surrogate models in the low-noise regime
TD Swinburne and D Perez, Machine Learning: Science and Technology 2025
@article{swinburne2025,
author={Swinburne, Thomas and Perez, Danny},
title={Parameter uncertainties for imperfect surrogate models in the low-noise regime},
journal={Machine Learning: Science and Technology},
doi={10.1088/2632-2153/ad9fce},
year={2025}
}