Implements several boosting algorithms in Python

KTBoost - A Python Package for Boosting

This Python package implements several boosting algorithms with different combinations of base learners, optimization algorithms, and loss functions.

Description

Concerning base learners, KTBoost includes:

  • Trees
  • Reproducing kernel Hilbert space (RKHS) ridge regression functions (i.e., posterior means of Gaussian processes)
  • A combination of the two (i.e., the KTBoost algorithm)
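
To make the kernel base learner concrete, the following is a minimal sketch of ridge regression in an RKHS, which coincides with the posterior mean of a zero-mean Gaussian process with noise variance `lam`. It is not KTBoost's implementation: the function names, the Gaussian kernel parametrization exp(-(a - b)^2 / theta^2), the regularization parameter `lam`, and the restriction to one-dimensional inputs are all assumptions made for illustration.

```python
import math


def gaussian_kernel(a, b, theta=1.0):
    """Gaussian kernel for scalar inputs; theta is an assumed range parameter."""
    return math.exp(-((a - b) ** 2) / (theta ** 2))


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy [A | b] so the inputs are left untouched.
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_kernel_ridge(xs, ys, theta=1.0, lam=0.1):
    """Return coefficients alpha = (K + lam * I)^{-1} y."""
    n = len(xs)
    K = [
        [gaussian_kernel(xs[i], xs[j], theta) + (lam if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return solve_linear_system(K, ys)


def predict_kernel_ridge(x_new, xs, alpha, theta=1.0):
    """Evaluate f(x_new) = sum_i alpha_i * k(x_new, x_i)."""
    return sum(a * gaussian_kernel(x_new, x, theta) for a, x in zip(alpha, xs))


# Usage: fit on four points and predict at a new location.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 4.0, 9.0]
alpha = fit_kernel_ridge(xs, ys, theta=1.0, lam=0.1)
print(predict_kernel_ridge(1.5, xs, alpha, theta=1.0))
```

A larger `lam` shrinks the fitted function toward zero, while a very small `lam` makes it nearly interpolate the training targets.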

Concerning the optimization step for finding the boosting updates, the package supports:

  • Gradient descent
  • Newton method (if applicable)
  • A hybrid version of the two for trees as base learners
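
The difference between the update types can be illustrated for a single tree leaf and the logistic loss. This is a sketch under stated assumptions, not the package's code: labels are coded as 0/1, the current model output is a log-odds score, and all helper names are hypothetical. A gradient step sets the leaf value to the mean negative gradient, whereas a Newton step divides the summed negative gradient by the summed second derivative. In the hybrid version, the tree structure is found as in the gradient step and the leaf values are then set as in the Newton step.

```python
import math


def sigmoid(z):
    """Logistic function mapping a log-odds score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def grad_hess(y, score):
    """First and second derivative of the logistic loss w.r.t. the score."""
    p = sigmoid(score)
    return p - y, p * (1.0 - p)


def gradient_leaf_value(ys, scores):
    """Leaf value of a gradient step: mean of the negative gradients."""
    grads = [grad_hess(y, s)[0] for y, s in zip(ys, scores)]
    return -sum(grads) / len(grads)


def newton_leaf_value(ys, scores):
    """Leaf value of a Newton step: -sum(gradients) / sum(second derivatives)."""
    pairs = [grad_hess(y, s) for y, s in zip(ys, scores)]
    sum_g = sum(g for g, _ in pairs)
    sum_h = sum(h for _, h in pairs)
    return -sum_g / sum_h


# Usage: four observations in one leaf, all with current score 0.
ys = [1, 1, 0, 1]
scores = [0.0, 0.0, 0.0, 0.0]
print(gradient_leaf_value(ys, scores))  # 0.25
print(newton_leaf_value(ys, scores))    # 1.0
```

For this leaf, the loss-minimizing constant is log(3), roughly 1.1, so the Newton value lands much closer to it in a single step than the gradient value does.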

The package implements the following loss functions:

  • Continuous data ("regression"): quadratic loss (L2 loss), absolute error (L1 loss), Huber loss, quantile regression loss, Gamma regression loss, negative Gaussian log-likelihood with both the mean and the standard deviation as functions of features
  • Count data ("regression"): Poisson regression loss
  • (Unordered) Categorical data ("classification"): logistic regression loss (log loss), exponential loss, cross entropy loss with softmax
  • Mixed continuous-categorical data ("censored regression"): negative Tobit likelihood (i.e., the Grabit model)
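
As an illustration of the last loss, the sketch below evaluates a two-sided Tobit negative log-likelihood for one observation. It assumes a latent Gaussian variable with mean `pred` and standard deviation `sigma` that is censored at a lower limit `yl` and an upper limit `yu`. The function names and the `sigma` argument are hypothetical and chosen for this example; it is not the package's implementation.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_nll(y, pred, yl, yu, sigma=1.0):
    """Negative Tobit log-likelihood of one observation y given prediction pred."""
    if y <= yl:
        # Censored from below: probability that the latent value is <= yl.
        return -math.log(normal_cdf((yl - pred) / sigma))
    if y >= yu:
        # Censored from above: probability that the latent value is >= yu.
        return -math.log(1.0 - normal_cdf((yu - pred) / sigma))
    # Uncensored: negative log of the Gaussian density.
    z = (y - pred) / sigma
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * z * z


# Usage: an uncensored and a lower-censored observation with limits 0 and 100.
print(tobit_nll(42.0, 40.0, yl=0.0, yu=100.0, sigma=5.0))
print(tobit_nll(0.0, 3.0, yl=0.0, yu=100.0, sigma=5.0))
```

For predictions far from a limit, the censored branches can run into log(0) in floating point; a production implementation would need a numerically stable log-CDF, which this sketch omits.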

Installation

The package can be installed from PyPI using

pip install -U KTBoost

and then loaded using

import KTBoost.KTBoost as KTBoost

Usage and examples

The package re-uses code from scikit-learn and its workflow is very similar to that of scikit-learn.

The two main classes are KTBoost.BoostingClassifier and KTBoost.BoostingRegressor.

The following code example defines models, trains them, and makes predictions.

import KTBoost.KTBoost as KTBoost

################################################
## Define model (see below for more examples) ##
################################################
## Standard tree boosting for regression with quadratic loss and hybrid gradient-Newton updates as in Friedman (2001)
model = KTBoost.BoostingRegressor(loss='ls')

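#################
## Train model ##
#################
## Xtrain: feature matrix of shape (n_samples, n_features); ytrain: targets of length n_samples
model.fit(Xtrain,ytrain)

######################
## Make predictions ##
######################
## Xpred: feature matrix with the same columns as Xtrain
ypred = model.predict(Xpred)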

#############################
## More examples of models ##
#############################
## Boosted Tobit model, i.e. Grabit model (Sigrist and Hirnschall, 2017), 
## with lower and upper limits at 0 and 100
model = KTBoost.BoostingRegressor(loss='tobit',yl=0,yu=100)
## KTBoost algorithm (combined kernel and tree boosting) for classification with Newton updates
model = KTBoost.BoostingClassifier(loss='deviance',base_learner='combined',
                                    update_step='newton',theta=1)
## Gradient boosting for classification with trees as base learners
model = KTBoost.BoostingClassifier(loss='deviance',update_step='gradient')
## Newton boosting for classification model with trees as base learners
model = KTBoost.BoostingClassifier(loss='deviance',update_step='newton')
## Hybrid gradient-Newton boosting (Friedman, 2001) for classification with 
## trees as base learners (this is the version that scikit-learn implements)
model = KTBoost.BoostingClassifier(loss='deviance',update_step='hybrid')
## Kernel boosting for regression with quadratic loss
model = KTBoost.BoostingRegressor(loss='ls',base_learner='kernel',theta=1)
## Kernel boosting with the Nystroem method and the range parameter theta chosen 
## as the average distance to the 100-nearest neighbors (of the Nystroem samples)
model = KTBoost.BoostingRegressor(loss='ls',base_learner='kernel',nystroem=True,
                                  n_components=1000,theta=None,n_neighbors=100)
## Regression model where both the mean and the standard deviation depend 
## on the covariates / features
model = KTBoost.BoostingRegressor(loss='msr')
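
For readers unfamiliar with what `fit` and `predict` do conceptually for quadratic loss with trees as base learners, here is a toy sketch of gradient boosting with depth-one trees (stumps) on one-dimensional inputs. The class `StumpBoostingRegressor` and its parameters `n_estimators` and `learning_rate` are hypothetical and written only for this illustration; the sketch neither uses nor reproduces KTBoost's code.

```python
class StumpBoostingRegressor:
    """Toy gradient boosting with stumps and quadratic loss (1-D inputs)."""

    def __init__(self, n_estimators=50, learning_rate=0.1):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate

    @staticmethod
    def _fit_stump(xs, residuals):
        """Return (threshold, left_value, right_value) with the lowest squared error."""
        sorted_xs = sorted(set(xs))
        mean = sum(residuals) / len(residuals)
        if len(sorted_xs) < 2:
            # No split is possible: fall back to a constant.
            return sorted_xs[0], mean, mean
        best = None
        for lo, hi in zip(sorted_xs, sorted_xs[1:]):
            thr = 0.5 * (lo + hi)
            left = [r for x, r in zip(xs, residuals) if x <= thr]
            right = [r for x, r in zip(xs, residuals) if x > thr]
            lv = sum(left) / len(left)
            rv = sum(right) / len(right)
            sse = (sum((r - lv) ** 2 for r in left)
                   + sum((r - rv) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, thr, lv, rv)
        return best[1:]

    def fit(self, xs, ys):
        # Start from the mean, the best constant under quadratic loss.
        self.init_ = sum(ys) / len(ys)
        self.stumps_ = []
        preds = [self.init_] * len(ys)
        for _ in range(self.n_estimators):
            # For quadratic loss the negative gradient is the residual.
            residuals = [y - p for y, p in zip(ys, preds)]
            thr, lv, rv = self._fit_stump(xs, residuals)
            self.stumps_.append((thr, lv, rv))
            preds = [p + self.learning_rate * (lv if x <= thr else rv)
                     for x, p in zip(xs, preds)]
        return self

    def predict(self, xs):
        out = []
        for x in xs:
            value = self.init_
            for thr, lv, rv in self.stumps_:
                value += self.learning_rate * (lv if x <= thr else rv)
            out.append(value)
        return out


# Usage: a step function is recovered by repeatedly fitting stumps to residuals.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 0, 10, 10, 10, 10]
toy_model = StumpBoostingRegressor(n_estimators=100, learning_rate=0.1).fit(xs, ys)
print(toy_model.predict([1, 6]))
```

The same define, fit, predict sequence applies to the KTBoost models above; only the base learner, loss, and update step differ.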

Author

Fabio Sigrist

References

  • Friedman, J., Hastie, T., & Tibshirani, R. (2000). Additive logistic regression: a statistical view of boosting. The Annals of Statistics, 28(2), 337-407.
  • Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. The Annals of Statistics, 29(5), 1189-1232.
  • Sigrist, F. (2018). Gradient and Newton Boosting for Classification and Regression. arXiv preprint arXiv:1808.03064.
  • Sigrist, F., & Hirnschall, C. (2017). Grabit: Gradient Tree Boosted Tobit Models for Default Prediction. arXiv preprint arXiv:1711.08695.
