
Python-based GBDT


Py-boost: a research tool for exploring GBDTs

Modern gradient boosting toolkits are very complex and are written in low-level programming languages. As a result,

  • It is hard to customize them to suit one’s needs
  • New ideas and methods are not easy to implement
  • It is difficult to understand how they work

Py-boost is a Python-based gradient boosting library which aims at overcoming the aforementioned problems.

Authors: Anton Vakhrushev, Leonid Iosipoi, Sergey Kupriyanov.

Py-boost Key Features

Simple. Py-boost is a simplified gradient boosting library, but it supports all main features and hyperparameters available in other implementations.

Fast with GPU. Although Py-boost is written in Python, it runs only on GPU, using Python GPU libraries such as CuPy and Numba.

Efficient inference. Since v0.4, Py-boost can perform efficient inference of tree ensembles on GPU. Moreover, once a model is trained on GPU, it can be converted to the treelite format with the built-in wrapper to run inference on a CPU-only machine (limitation: the model must be trained with target_splitter='Single', which is the default).

ONNX compatible. Since v0.5, Py-boost is compatible with the ONNX format, which gives more options for CPU inference and model deployment.

Easy to customize. Py-boost can be customized even by those who are not familiar with GPU programming (often it is enough to replace np with cp, that is, NumPy with CuPy). What can be customized? Almost everything, via custom callbacks. Examples: the row/column sampling strategy, training control, losses and metrics, and the multioutput handling strategy.
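The idea of controlling training through callbacks can be sketched without a GPU. The snippet below is not Py-boost's API: Callback, EarlyStopping, and train are hypothetical names, and the "training loop" simply replays a list of validation losses instead of fitting trees. It only illustrates how a hook that runs after each iteration can decide to stop training.

```python
class Callback:
    """Hypothetical hook interface (an assumption, not Py-boost's base class)."""

    def before_train(self, state):
        pass

    def after_iteration(self, state):
        """Return True to request that training stops."""
        return False

    def after_train(self, state):
        pass


class EarlyStopping(Callback):
    """Stop when the validation loss has not improved for `patience` iterations."""

    def __init__(self, patience):
        self.patience = patience
        self.best = float("inf")
        self.best_iter = -1

    def after_iteration(self, state):
        if state["val_loss"] < self.best:
            self.best = state["val_loss"]
            self.best_iter = state["iteration"]
        return state["iteration"] - self.best_iter >= self.patience


def train(val_losses, callbacks):
    """Toy loop: replays precomputed validation losses, one per iteration."""
    state = {"iteration": -1, "val_loss": None}
    for cb in callbacks:
        cb.before_train(state)
    for i, loss in enumerate(val_losses):
        state["iteration"], state["val_loss"] = i, loss
        # Call every callback (no short-circuit) so each one sees each iteration.
        stop_flags = [cb.after_iteration(state) for cb in callbacks]
        if any(stop_flags):
            break
    for cb in callbacks:
        cb.after_train(state)
    return state["iteration"]


stopper = EarlyStopping(patience=2)
last_iter = train([0.9, 0.7, 0.6, 0.65, 0.66, 0.5], [stopper])
print(last_iter, stopper.best_iter, stopper.best)
```

In this toy run the loss stops improving after iteration 2, so the loop exits at iteration 4 and never reaches the final value in the list. A sampling strategy or a custom metric could plug into the same kind of hook.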

SketchBoost paper

Multioutput training. Current state-of-the-art boosting toolkits provide very limited support for multioutput training. Even when this option is available, training for tasks such as multiclass/multilabel classification and multitask regression is slow, because the training complexity scales linearly with the number of outputs. To overcome these limitations, we created the SketchBoost algorithm, which uses approximate tree structure search. As we show in the paper, this strategy at least does not decrease performance and often improves accuracy.
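To make the cost argument concrete: if split search has to scan a gradient matrix with one column per output, shrinking that matrix to a few columns shrinks the search cost accordingly. The pure-Python sketch below shows two possible reductions. The function names and the two specific reductions are illustrative assumptions, not taken from the text above, and this is not Py-boost code. In such a scheme the reduced matrix would serve only to choose the tree structure.

```python
import math
import random


def top_outputs_sketch(grad, k):
    """Keep the k output columns with the largest squared norm.

    grad is a list of rows, one row per sample, one entry per output.
    Returns the reduced matrix and the indices of the kept columns.
    """
    n_outputs = len(grad[0])
    norms = [sum(row[j] ** 2 for row in grad) for j in range(n_outputs)]
    keep = sorted(range(n_outputs), key=lambda j: norms[j], reverse=True)[:k]
    keep.sort()  # preserve the original column order
    return [[row[j] for j in keep] for row in grad], keep


def random_projection_sketch(grad, k, seed=0):
    """Multiply grad (n x d) by a random Gaussian matrix (d x k)."""
    rng = random.Random(seed)
    n_outputs = len(grad[0])
    proj = [
        [rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(k)]
        for _ in range(n_outputs)
    ]
    return [
        [sum(row[j] * proj[j][c] for j in range(n_outputs)) for c in range(k)]
        for row in grad
    ]


# Toy gradient matrix: 3 samples, 4 outputs; columns 0 and 2 dominate.
grad = [
    [0.9, 0.01, -0.5, 0.02],
    [-0.8, 0.00, 0.4, -0.01],
    [0.7, 0.02, -0.6, 0.03],
]

top, kept = top_outputs_sketch(grad, k=2)
projected = random_projection_sketch(grad, k=2)
print(kept)                              # indices of the retained outputs
print(len(projected), len(projected[0])) # reduced shape: samples x k
```

Both functions turn a 3 x 4 matrix into a 3 x 2 one, so any per-output work done during split search is halved in this toy case.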

SketchBoost. You can try our sketching strategies by using the SketchBoost class, or you can implement your own strategy and pass it to the GradientBoosting constructor as the multioutput_sketch parameter. For details, please see Tutorial_2_Advanced_multioutput.

Installation

Before installing py-boost via pip you should have cupy installed. You can use:

pip install -U cupy-cuda110 py-boost

Note: replace cuda110 with the suffix that matches your CUDA version. For details, see the CuPy installation guide.
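After installation, a short check can confirm that CuPy is importable and sees a GPU. The sketch below is an assumption-laden helper, not part of Py-boost: decode_cuda_version, legacy_wheel_name, and gpu_environment are hypothetical names. The wheel name it builds only follows the pattern of the command above (cupy-cuda110 for CUDA 11.0); newer CuPy releases publish wheels under names such as cupy-cuda11x and cupy-cuda12x, so check the guide before relying on it.

```python
import importlib.util


def decode_cuda_version(code):
    """CUDA reports versions as 1000 * major + 10 * minor (11020 means 11.2)."""
    return code // 1000, (code % 1000) // 10


def legacy_wheel_name(major, minor):
    """Wheel name in the style used by the install command above (assumption)."""
    return f"cupy-cuda{major}{minor}"


def gpu_environment():
    """Report whether CuPy is installed and how many GPUs it can see."""
    info = {
        "cupy_installed": importlib.util.find_spec("cupy") is not None,
        "cuda_runtime": None,
        "gpu_count": 0,
    }
    if not info["cupy_installed"]:
        return info
    try:
        import cupy

        info["cuda_runtime"] = decode_cuda_version(
            cupy.cuda.runtime.runtimeGetVersion()
        )
        info["gpu_count"] = cupy.cuda.runtime.getDeviceCount()
    except Exception:
        # CuPy is present but no usable CUDA driver or device was found.
        pass
    return info


print(legacy_wheel_name(11, 0))
print(gpu_environment())
```

If gpu_count is 0, Py-boost will not be able to train, since it runs only on GPU.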

Quick tour

Py-boost is easy to use because its interface is similar to that of scikit-learn. For usage examples, please see the project tutorials.

More examples are coming soon.
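As a rough illustration of the scikit-learn-like workflow, the sketch below wraps a fit/predict round trip in a function. It assumes a CUDA GPU with cupy and py-boost installed, and it assumes that GradientBoosting accepts the loss name 'mse' as its first argument and exposes fit and predict; verify these against the installed version. The imports sit inside the function so that defining it does not require a GPU, and quick_tour itself is a hypothetical name.

```python
def quick_tour(n_rows=10_000, n_features=20, seed=42):
    """Train a regression model on synthetic data and return its predictions.

    Requires a CUDA GPU with cupy and py-boost installed.
    """
    import numpy as np
    from py_boost import GradientBoosting  # assumed import path

    # Synthetic data: the target depends on the first two features plus noise.
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_rows, n_features)).astype(np.float32)
    y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=n_rows)

    model = GradientBoosting("mse")  # assumed: loss name as first argument
    model.fit(X, y)
    return model.predict(X)
```

On a machine with a GPU, calling quick_tour() should return one prediction per input row.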
