Probabilistic Gradient Boosting Machines

Probabilistic Gradient Boosting Machines (PGBM) is a probabilistic gradient boosting framework in Python based on PyTorch/Numba, developed by Airlab in Amsterdam. It provides the following advantages over existing frameworks:

  • Probabilistic regression estimates instead of only point estimates.
  • Auto-differentiation of custom loss functions.
  • Native GPU-acceleration.
  • Distributed training for CPU and GPU, across multiple nodes.
  • Ability to optimize probabilistic estimates after training for a set of common distributions, without retraining the model.
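
To make the last point concrete, here is a minimal sketch of what choosing a distribution after training can look like. It is not PGBM's implementation: it assumes a trained model already supplies a mean and a standard deviation per sample, and it scores two candidate distributions on held-out data with the continuous ranked probability score (CRPS), using the standard closed-form expressions for the normal and Laplace distributions. The function names and the validation numbers are hypothetical.

```python
import math

def crps_normal(mu, sigma, y):
    """Closed-form CRPS of a Normal(mu, sigma) forecast for observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

def crps_laplace(mu, sigma, y):
    """Closed-form CRPS of a Laplace forecast with mean mu and std sigma."""
    b = sigma / math.sqrt(2.0)  # Laplace scale that matches the given std
    a = abs(y - mu) / b
    return b * (a + math.exp(-a) - 0.75)

def select_distribution(mus, sigmas, ys):
    """Return the candidate with the lowest mean CRPS, plus all mean scores."""
    candidates = {"normal": crps_normal, "laplace": crps_laplace}
    scores = {
        name: sum(fn(m, s, y) for m, s, y in zip(mus, sigmas, ys)) / len(ys)
        for name, fn in candidates.items()
    }
    return min(scores, key=scores.get), scores

# Hypothetical validation data: per-sample mean and std from a trained model,
# followed by the observed targets.
mus = [2.0, 3.5, 1.0, 4.2]
sigmas = [0.5, 0.8, 0.3, 1.0]
ys = [2.4, 2.9, 1.1, 5.6]
best, scores = select_distribution(mus, sigmas, ys)
print(best, scores)
```

Because only the scoring step is repeated for each candidate, the model itself never has to be retrained.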

It is aimed at users interested in solving large-scale tabular probabilistic regression problems, such as probabilistic time series forecasting.

For more details, read the docs or our paper, or check out the examples.

Below is a simple example using our scikit-learn wrapper:

from pgbm import PGBMRegressor
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

# Load the data and hold out 10% of it as a test set
X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)
# Train with default settings
model = PGBMRegressor().fit(X_train, y_train)
# Point predictions and probabilistic predictions for the test set
yhat_point = model.predict(X_test)
yhat_dist = model.predict_dist(X_test)
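
A common next step is to turn the probabilistic predictions into prediction intervals. The sketch below assumes that the probabilistic output can be read as a table of draws, with one row per draw and one column per test point; check the docs for the actual return type and shape. The helper names and the stand-in data are hypothetical, and plain Python lists are used so the snippet runs without PGBM installed.

```python
import random

def empirical_quantile(values, q):
    """Linear-interpolation quantile of a list of floats, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1.0 - frac) + ordered[upper] * frac

def prediction_intervals(samples, coverage=0.9):
    """samples[i][j] is draw i for test point j; returns one (low, high) per point."""
    alpha = (1.0 - coverage) / 2.0
    n_points = len(samples[0])
    intervals = []
    for j in range(n_points):
        column = [draw[j] for draw in samples]
        intervals.append((empirical_quantile(column, alpha),
                          empirical_quantile(column, 1.0 - alpha)))
    return intervals

# Stand-in for model output: 1000 draws for 3 test points (hypothetical values).
rng = random.Random(0)
centers = [1.5, 2.0, 3.2]
samples = [[rng.gauss(c, 0.4) for c in centers] for _ in range(1000)]
for low, high in prediction_intervals(samples, coverage=0.9):
    print(f"90% interval: [{low:.2f}, {high:.2f}]")
```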

Installation

See Installation section in our docs.

Support

In general, PGBM works similarly to existing gradient boosting packages such as LightGBM or XGBoost, and it should be possible to use it more or less as a drop-in replacement, except that you must explicitly define a loss function and a loss metric.
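
As an illustration of what such a pair can look like, here is a minimal sketch of a squared-error loss function that returns a gradient and a hessian per sample, together with a root-mean-squared-error metric. The names and the (yhat, y, sample_weight) signature are assumptions made for this sketch; PGBM itself operates on tensors or arrays rather than Python lists, and the docs describe the exact form it expects and how to pass these functions to the model.

```python
import math

def mse_objective(yhat, y, sample_weight=None):
    """Per-sample gradient and hessian of the loss 0.5 * w * (yhat - y)^2."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y)
    gradient = [w * (p - t) for p, t, w in zip(yhat, y, sample_weight)]
    hessian = list(sample_weight)  # the second derivative is the weight itself
    return gradient, hessian

def rmse_metric(yhat, y, sample_weight=None):
    """Weighted root mean squared error, used to monitor training."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y)
    weighted_sq = sum(w * (p - t) ** 2 for p, t, w in zip(yhat, y, sample_weight))
    return math.sqrt(weighted_sq / sum(sample_weight))

# Hypothetical predictions and targets
yhat = [1.0, 3.0]
y = [0.0, 5.0]
print(mse_objective(yhat, y))
print(rmse_metric(yhat, y))
```

The loss function drives the tree construction through its gradient and hessian, while the metric is only evaluated, so the two do not have to measure the same quantity.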

In case further support is required, open an issue.

Reference

Olivier Sprangers, Sebastian Schelter, Maarten de Rijke. Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression. Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '21), August 14–18, 2021, Virtual Event, Singapore.

The experiments from our paper can be replicated by running the scripts in the experiments folder. The experiments download datasets as needed, except for Higgs and M5, which must be downloaded beforehand and saved to the datasets folder (Higgs) and to datasets/m5 (M5).

License

This project is licensed under the terms of the Apache 2.0 license.

Acknowledgements

This project was developed by Airlab Amsterdam.
