A small machine learning library for doing gradient boosting

Please check out the website if you're looking for the documentation!

What is this?

This is StarBoost, a Python library that implements gradient boosting. Gradient boosting is an efficient and popular machine learning algorithm used for supervised learning.

Doesn't scikit-learn already do that?

Indeed, scikit-learn implements gradient boosting, but the only weak learner it supports is a decision tree. In principle, gradient boosting can be used with weak learners other than decision trees.

What about XGBoost/LightGBM/CatBoost?

These libraries are the state of the art for gradient boosted decision trees (GBDT). They implement a specific version of gradient boosting that is tailored to decision trees. StarBoost's purpose isn't to compete with them. Instead, its goal is to implement a generic gradient boosting algorithm that works with any weak learner.

A focus of StarBoost is to keep the code readable and commented, instead of obfuscating the algorithm under a pile of tangled code.
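
To make the idea of a generic algorithm concrete, here is a minimal pure-Python sketch of gradient boosting with a squared-error loss. It is not StarBoost's actual code: the names TinyBooster, OneFeatureLinear, make_learner and mse are hypothetical and exist only for this illustration. The one assumption about the weak learner is that it exposes fit(X, y) and predict(X).

```python
class OneFeatureLinear:
    """Hypothetical weak learner: a least-squares line on a single feature."""

    def __init__(self, feature=0):
        self.feature = feature

    def fit(self, X, y):
        xs = [row[self.feature] for row in X]
        n = len(xs)
        x_mean = sum(xs) / n
        y_mean = sum(y) / n
        var = sum((x - x_mean) ** 2 for x in xs)
        cov = sum((x - x_mean) * (t - y_mean) for x, t in zip(xs, y))
        self.slope_ = cov / var if var > 0 else 0.0
        self.intercept_ = y_mean - self.slope_ * x_mean
        return self

    def predict(self, X):
        return [self.intercept_ + self.slope_ * row[self.feature] for row in X]


class TinyBooster:
    """Illustrative boosting loop for squared error (not StarBoost's implementation)."""

    def __init__(self, make_learner, n_estimators=30, learning_rate=0.1):
        self.make_learner = make_learner  # any callable returning a fresh weak learner
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate

    def fit(self, X, y):
        # Start from a constant prediction: the mean of the targets.
        self.init_ = sum(y) / len(y)
        self.learners_ = []
        y_pred = [self.init_] * len(y)
        for _ in range(self.n_estimators):
            # For squared error, the negative gradient is simply the residual.
            residuals = [t - p for t, p in zip(y, y_pred)]
            learner = self.make_learner().fit(X, residuals)
            self.learners_.append(learner)
            # Take a small step in the direction the weak learner suggests.
            step = learner.predict(X)
            y_pred = [p + self.learning_rate * s for p, s in zip(y_pred, step)]
        return self

    def predict(self, X):
        y_pred = [self.init_] * len(X)
        for learner in self.learners_:
            step = learner.predict(X)
            y_pred = [p + self.learning_rate * s for p, s in zip(y_pred, step)]
        return y_pred


def mse(y_true, y_pred):
    """Mean squared error between two equally long lists."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [1.0, 3.0, 5.0, 7.0, 9.0]

model = TinyBooster(make_learner=OneFeatureLinear).fit(X, y)

baseline = mse(y, [sum(y) / len(y)] * len(y))  # error of the constant model
boosted = mse(y, model.predict(X))             # error after boosting
print(baseline, boosted)
```

Nothing in the loop depends on what the weak learner is, so swapping OneFeatureLinear for any other object with the same two methods should work without changing TinyBooster.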

What's a weak learner?

A weak learner is any machine learning model that can learn from labeled data. It's called "weak" because it usually works better as part of an ensemble (such as gradient boosting). Examples are linear models, radial basis functions, decision trees, genetic programming, neural networks, etc. In theory you could even use gradient boosting as a weak learner.
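
As a rough illustration of what "weak" means, the sketch below implements a decision stump: a model with a single threshold and two constant outputs. The class DecisionStump and its attributes are hypothetical and not part of StarBoost. The fit/predict interface is an assumption that mirrors the scikit-learn convention.

```python
class DecisionStump:
    """Hypothetical weak learner: one threshold on one feature, two constant leaves."""

    def __init__(self, feature=0):
        self.feature = feature

    def fit(self, X, y):
        xs = [row[self.feature] for row in X]
        values = sorted(set(xs))
        # Fall back to a constant prediction when no split helps.
        self.threshold_ = None
        self.left_ = self.right_ = sum(y) / len(y)
        best_error = sum((t - self.left_) ** 2 for t in y)
        # Try a threshold halfway between each pair of neighbouring values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [t for x, t in zip(xs, y) if x <= threshold]
            right = [t for x, t in zip(xs, y) if x > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((t - left_mean) ** 2 for t in left)
                     + sum((t - right_mean) ** 2 for t in right))
            if error < best_error:
                best_error = error
                self.threshold_ = threshold
                self.left_, self.right_ = left_mean, right_mean
        return self

    def predict(self, X):
        if self.threshold_ is None:
            return [self.left_] * len(X)
        return [self.left_ if row[self.feature] <= self.threshold_ else self.right_
                for row in X]


X = [[1.0], [2.0], [3.0], [4.0]]
y = [1.0, 2.0, 6.0, 7.0]

stump = DecisionStump().fit(X, y)
print(stump.threshold_)   # 2.5
print(stump.predict(X))   # [1.5, 1.5, 6.5, 6.5]
```

On its own the stump can only output two distinct values, so it cannot match all four targets. An ensemble method such as gradient boosting combines many such coarse models to reduce the remaining error.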

Is it compatible with scikit-learn?

Yes, it is.

How do I install it?

Barring any weird Python setup, you simply have to run pip install starboost.

How do I use it?

The following snippet shows a very basic usage of StarBoost. Please check out the examples directory for comprehensive examples.

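from sklearn import datasets
from sklearn import tree
import starboost as sb

# load_boston was removed in scikit-learn 1.2, so use another built-in regression dataset
X, y = datasets.load_diabetes(return_X_y=True)

# Boost 30 shallow regression trees, shrinking each tree's contribution by 0.1
model = sb.BoostingRegressor(
    base_estimator=tree.DecisionTreeRegressor(max_depth=3),
    n_estimators=30,
    learning_rate=0.1
)

model = model.fit(X, y)

# Predictions on the training data
y_pred = model.predict(X)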

You can find the source code for running the benchmarks here.

What are you planning on doing next?

  • Logging the progress
  • Handling sample weights
  • Implementing more loss functions
  • Making it faster
  • Newton boosting (taking into account the information from the Hessian)
  • Learning to rank

By the way, why is it called "StarBoost"?

As you might already know, in programming the star symbol * often refers to the concept of "everything". The idea is that StarBoost can be used with any weak learner, not just decision trees.

License

The MIT License (MIT). Please see the LICENSE file for more information.

