hgboost is a Python package for hyperparameter optimization of xgboost, catboost and lightboost models, for both classification and regression tasks.

hgboost - Hyperoptimized Gradient Boosting

hgboost is short for Hyperoptimized Gradient Boosting. It is a Python package that optimizes the hyperparameters of xgboost, catboost and lightboost (LightGBM) models using cross-validation, and then evaluates the results on an independent validation set. hgboost can be applied to classification and regression tasks.

hgboost is fun because:

* 1. Hyperoptimizes the parameter space using a Bayesian approach.
* 2. Determines the best-scoring model(s) using k-fold cross-validation.
* 3. Evaluates the best model on an independent evaluation set.
* 4. Fits the model on the entire input data using the best parameters.
* 5. Works for classification and regression.
* 6. Creates a super-hyperoptimized model from an ensemble of all individually optimized models.
* 7. Returns the model, the search space and the test/evaluation results.
* 8. Makes insightful plots.
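
The first four steps above can be sketched in plain Python. This is a toy illustration and not hgboost's implementation: a 1-D k-nearest-neighbour classifier stands in for the boosting models, random search stands in for the Bayesian search, and the helper names (`knn_predict`, `cv_accuracy`, `optimize`) are hypothetical. The keyword arguments only mirror the names used by hgboost's initializer.

```python
import random

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (1-D feature)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, test, k):
    """Fraction of test points classified correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

def cv_accuracy(data, k, n_folds):
    """Mean accuracy over n_folds train/held-out splits."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        scores.append(accuracy(train, held_out, k))
    return sum(scores) / n_folds

def optimize(data, max_eval=10, cv=5, test_size=0.2, val_size=0.2,
             top_cv_evals=3, seed=42):
    rng = random.Random(seed)
    data = list(data)
    rng.shuffle(data)

    # Set the independent validation set aside before any tuning.
    n_val = int(len(data) * val_size)
    val, work = data[:n_val], data[n_val:]
    n_test = int(len(work) * test_size)
    test, train = work[:n_test], work[n_test:]

    # Step 1: search the parameter space (random search as a stand-in).
    trials = {}
    for _ in range(max_eval):
        k = rng.randrange(1, 16)
        trials[k] = accuracy(train, test, k)

    # Step 2: re-score the top candidates with k-fold cross-validation.
    top = sorted(trials, key=trials.get, reverse=True)[:top_cv_evals]
    cv_scores = {k: cv_accuracy(work, k, cv) for k in top}
    best_k = max(cv_scores, key=cv_scores.get)

    # Step 3: evaluate the winner on the untouched validation set.
    val_score = accuracy(work, val, best_k)

    # Step 4: "refit" on all data (for kNN this just means keeping every point).
    return {"best_k": best_k, "cv_score": cv_scores[best_k],
            "val_score": val_score, "model": (data, best_k)}

# Toy data: the label is 1 when the feature exceeds 0.5.
_rng = random.Random(0)
toy_data = [(x, int(x > 0.5)) for x in (_rng.random() for _ in range(200))]
result = optimize(toy_data)
print(result["best_k"], round(result["val_score"], 3))
```

The point of the sketch is the order of operations: the validation set is split off first and is only touched once, after the search and the cross-validation have picked a winner.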

⭐️ Star this repo if you like it ⭐️

Documentation pages

On the documentation pages you can find detailed information about how hgboost works, with many examples.

Colab Notebooks

  • Regression example (open in Colab)

  • Classification example (open in Colab)

Schematic overview of hgboost

Installation Environment

conda create -n env_hgboost python=3.8
conda activate env_hgboost

Install from PyPI

pip install hgboost
pip install -U hgboost # Upgrade to the latest version

Import hgboost package

from hgboost import hgboost
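
To confirm that the installation succeeded before running the examples, the installed version can be looked up with the standard library. This is a minimal sketch that assumes Python 3.8 or later (as in the conda environment above); the helper name `installed_version` is hypothetical and not part of hgboost.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

version = installed_version("hgboost")
print("hgboost", version if version else "is not installed")
```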

Examples

Classification example for xgboost, catboost and lightboost:

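# Load library
from hgboost import hgboost

# Initialization
hgb = hgboost(max_eval=10, threshold=0.5, cv=5, test_size=0.2, val_size=0.2, top_cv_evals=10, random_state=42)

# Import example data (assumption: the Titanic set bundled with hgboost)
df = hgb.import_example()
y = df['Survived'].values
y = y.astype(str)
y[y=='1'] = 'survived'
y[y=='0'] = 'dead'

# Preprocess by encoding the variables
del df['Survived']
X = hgb.preprocessing(df)

# Fit xgboost by hyperoptimization and cross-validation
results = hgb.xgboost(X, y, pos_label='survived')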

# [hgboost] >Start hgboost classification..
# [hgboost] >Collecting xgb_clf parameters.
# [hgboost] >Number of variables in search space is [11], loss function: [auc].
# [hgboost] >method: xgb_clf
# [hgboost] >eval_metric: auc
# [hgboost] >greater_is_better: True
# [hgboost] >pos_label: True
# [hgboost] >Total dataset: (891, 204) 
# [hgboost] >Hyperparameter optimization..
#  100% |----| 500/500 [04:39<05:21,  1.33s/trial, best loss: -0.8800619834710744]
# [hgboost] >Best performing [xgb_clf] model: auc=0.881198
# [hgboost] >5-fold cross validation for the top 10 scoring models, Total nr. tests: 50
# 100%|██████████| 10/10 [00:42<00:00,  4.27s/it]
# [hgboost] >Evalute best [xgb_clf] model on independent validation dataset (179 samples, 20.00%).
# [hgboost] >[auc] on independent validation dataset: -0.832
# [hgboost] >Retrain [xgb_clf] on the entire dataset with the optimal parameters settings.

# Plot the validation results
hgb.plot_validation()
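
The sample counts in the log above follow from the initialization parameters. The sketch below reproduces them under two assumptions that the log does not confirm: the validation set is split off first, and split sizes are rounded up (as scikit-learn's `train_test_split` does). The helper names `split_sizes` and `n_cv_fits` are hypothetical.

```python
import math

def split_sizes(n_samples, val_size=0.2, test_size=0.2):
    """Sizes of the train/test/validation sets, assuming ceil rounding."""
    n_val = math.ceil(n_samples * val_size)   # independent validation set
    n_rest = n_samples - n_val                # data left for optimization
    n_test = math.ceil(n_rest * test_size)    # test set used during the search
    return {"train": n_rest - n_test, "test": n_test, "val": n_val}

def n_cv_fits(cv, top_cv_evals):
    """Number of model fits needed to cross-validate the top models."""
    return cv * top_cv_evals

sizes = split_sizes(891)   # 891 samples, as in the log
fits = n_cv_fits(cv=5, top_cv_evals=10)
print(sizes, fits)
```

With 891 samples this gives 179 validation samples and 50 cross-validation fits, matching the "179 samples" and "Total nr. tests: 50" lines in the log. The train and test counts are not printed in the log, so they depend on the assumptions stated above.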


References

* http://hyperopt.github.io/hyperopt/
* https://github.com/dmlc/xgboost
* https://github.com/microsoft/LightGBM
* https://github.com/catboost/catboost

Maintainers

Contribute

  • Contributions are welcome.

Licence

See LICENSE for details.

Coffee

  • If you wish to buy me a coffee for this work, it is much appreciated :)
