
Methods for online / incremental estimation of distributional regression models


ROLCH: Regularized Online Learning for Conditional Heteroskedasticity

License: MIT

Rename

rolch is an acronym for "Regularized Online Learning for Conditional Heteroskedasticity". Because distributional regression covers more than conditional heteroskedasticity, we have decided to rename the package to ondil. This is the last release under the name rolch. The new package will be available under the name ondil on PyPI and GitHub.

Introduction

This package provides online estimation of distributional regression models, that is, models for conditionally heteroskedastic data. The main contribution is an online/incremental implementation of generalized additive models for location, scale and shape (GAMLSS, see Rigby & Stasinopoulos, 2005), developed in Hirsch, Berrisch & Ziel, 2024.

Please have a look at the documentation or the example notebook.

We're actively working on the package and welcome contributions from the community. Have a look at the Release Notes and the Issue Tracker.

Distributional Regression

The main idea of distributional regression (or regression beyond the mean, multiparameter regression) is that the response variable $Y$ is distributed according to a specified distribution $\mathcal{F}(\theta)$, where $\theta$ is the parameter vector of the distribution. In the Gaussian case, we have $\theta = (\theta_1, \theta_2) = (\mu, \sigma)$. We then specify an individual regression model for each parameter $\theta_k$ of the distribution, of the form

$$g_k(\theta_k) = \eta_k = X_k\beta_k$$

where $g_k(\cdot)$ is a link function, which ensures that the predicted distribution parameters are in a sensible range (we don't want, e.g., negative standard deviations), and $\eta_k$ is the predictor. For the Gaussian case, this implies two regression equations, one for the mean (location) and one for the standard deviation (scale). Distributions other than the normal distribution are possible, and some are already implemented, e.g. Student's $t$-distribution and Johnson's $S_U$ distribution. If you are interested in another distribution, please open an Issue.
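To make the role of the link function concrete, here is a minimal NumPy sketch for the Gaussian case. It assumes an identity link for $\mu$ and a log link for $\sigma$; these choices, the coefficient values and all function names are illustrative assumptions and not taken from the package.

```python
import numpy as np

# Assumed links for illustration: identity for the mean, log for the scale.
def identity_link(theta):
    return theta

def identity_inverse(eta):
    return eta

def log_link(theta):
    return np.log(theta)

def log_inverse(eta):
    return np.exp(eta)

rng = np.random.default_rng(0)
# Design matrix with an intercept column and one covariate
X = np.column_stack([np.ones(5), rng.normal(size=5)])

# Hypothetical coefficient vectors, one per distribution parameter
beta_mu = np.array([1.0, 0.5])
beta_sigma = np.array([-0.2, 0.8])

# Linear predictors eta_k = X beta_k
eta_mu = X @ beta_mu
eta_sigma = X @ beta_sigma

# Map predictors back to the parameter scale: theta_k = g_k^{-1}(eta_k)
mu = identity_inverse(eta_mu)
sigma = log_inverse(eta_sigma)

print("mu:   ", mu)
print("sigma:", sigma)
```

Whatever sign the predictor $\eta_\sigma$ takes, the inverse log link returns a strictly positive standard deviation, which is the "sensible range" property described above.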

This allows us to specify very flexible models that consider the conditional behaviour of the variable's volatility, skewness and tail behaviour. A simple example from electricity markets is wind forecasts, which are skewed depending on the production level: intuitively, there is a higher risk of lower production if the production level is already high, since it cannot go much higher than "full load" and, if it does, the turbines might cut off. Modelling these conditional probabilistic behaviours is the key strength of distributional regression models.

Example

Basic estimation and updating procedure:

import rolch
import numpy as np
from sklearn.datasets import load_diabetes

X, y = load_diabetes(return_X_y=True)

# Covariates entering the regression equation of each distribution parameter
equation = {
    0: "all",  # Can also use "intercept" or np.ndarray of integers / booleans
    1: "all",
    2: "all",
}

# Create the estimator
online_gamlss_lasso = rolch.OnlineGamlss(
    distribution=rolch.DistributionT(),
    method="lasso",
    equation=equation,
    fit_intercept=True,
    ic="bic",
)

# Initial Fit
online_gamlss_lasso.fit(
    X=X[:-11, :], 
    y=y[:-11], 
)
print("Coefficients for the first N-11 observations \n")
print(online_gamlss_lasso.beta)

# Update call
online_gamlss_lasso.update(
    X=X[[-11], :], 
    y=y[[-11]]
)
print("\nCoefficients after update call \n")
print(online_gamlss_lasso.beta)

# Prediction for the last 10 observations
prediction = online_gamlss_lasso.predict(
    X=X[-10:, :]
)

print("\n Predictions for the last 10 observations")
# Location, scale and shape (degrees of freedom)
print(prediction)
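The update call above takes only the newest observation instead of the full data set. As a rough intuition for why that can be sufficient, the following sketch updates the sufficient statistics of an ordinary least-squares fit one row at a time. It is a plain NumPy illustration under simplifying assumptions (unregularized, mean-only regression) and is not a description of rolch's internals; `init_gram`, `update_gram` and the `forget` parameter are hypothetical names.

```python
import numpy as np

def init_gram(X, y):
    """Sufficient statistics of least squares: G = X'X and h = X'y."""
    return X.T @ X, X.T @ y

def update_gram(G, h, x_new, y_new, forget=0.0):
    """Add one observation; forget > 0 down-weights older data."""
    gamma = 1.0 - forget
    G_new = gamma * G + np.outer(x_new, x_new)
    h_new = gamma * h + x_new * y_new
    return G_new, h_new

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)

# Initial fit on all but the last row, then one incremental update
G, h = init_gram(X[:-1], y[:-1])
G, h = update_gram(G, h, X[-1], y[-1])
beta_online = np.linalg.solve(G, h)

# Reference: batch fit on the full data
beta_batch = np.linalg.lstsq(X, y, rcond=None)[0]

print(beta_online)
print(beta_batch)
```

Without forgetting, the incrementally updated coefficients match the batch solution, while the update itself never touches the earlier rows.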

Installation & Dependencies

The package is available from PyPI: run pip install rolch and enjoy.

ROLCH is designed to have minimal dependencies. We rely on python>=3.10, numpy, numba and scipy in reasonably up-to-date versions.

Authors

  • Simon Hirsch, University of Duisburg-Essen & Statkraft
  • Jonathan Berrisch, University of Duisburg-Essen
  • Florian Ziel, University of Duisburg-Essen

Acknowledgements & Disclosure

Simon is employed at Statkraft and gratefully acknowledges support received from Statkraft for his PhD studies. This work contains the author's opinion and does not necessarily reflect Statkraft's position.

Contributing

We welcome every contribution from the community. Feel free to open an issue if you find bugs or want to propose changes.

We're still in an early phase and welcome feedback, especially on the usability and "look and feel" of the package. We're also working to port distributions from the R-GAMLSS package and welcome corresponding PRs.

To get started, just create a fork and get going. We will modularize the code over the next versions and increase our testing coverage. We use ruff and black as formatters.

Install from source:

  1. Clone this repo.
  2. Install the necessary dependencies from the requirements.txt using conda create --name <env> --file requirements.txt.
  3. Run python3 -m build to build the wheel.
  4. Run pip install dist/rolch-0.1.0-py3-none-any.whl, replacing 0.1.0 with the version you built. If necessary, append --force-reinstall.
  5. Enjoy.
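After installing the wheel, one way to confirm which version ended up in the active environment is to query the package metadata from the standard library. This is a small sketch; `installed_version` is a hypothetical helper name, not part of rolch.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

rolch_version = installed_version("rolch")
if rolch_version is None:
    print("rolch is not installed in this environment")
else:
    print(f"rolch {rolch_version} is installed")
```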

