A high-performance gradient boosting implementation using Cython
Project description
CyBooster: A Gradient Boosting Library
CyBooster is a high-performance, generic gradient boosting library (any base learner can be used) designed for classification and regression tasks. It is built with Cython for speed and efficiency, making it suitable for large datasets and complex models.
Each base learner is augmented with a randomized neural network, generalizing LSBoost (https://www.researchgate.net/publication/346059361_LSBoost_gradient_boosted_penalized_nonlinear_least_squares) to any base learner, which allows the model to learn complex patterns in the data. The library supports both classification and regression, making it versatile for various machine learning applications.
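As a rough sketch of the idea (not CyBooster's exact implementation), one boosting step fits the base learner on the original features augmented with a random, untrained hidden layer, then subtracts a shrunken version of its prediction from the residuals. The toy example below uses scikit-learn's Ridge as a stand-in base learner on synthetic data:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=200)

n_hidden, learning_rate = 10, 0.1
residual = y.copy()
fitted = np.zeros_like(y)
for _ in range(50):
    W = rng.normal(size=(X.shape[1], n_hidden))   # randomized, untrained hidden layer
    H = np.tanh(X @ W)                            # nonlinear random features
    Z = np.hstack([X, H])                         # augmented design matrix
    learner = Ridge(alpha=1.0).fit(Z, residual)   # base learner fits current residuals
    update = learning_rate * learner.predict(Z)
    fitted += update
    residual -= update

print(f"final training MSE: {np.mean(residual ** 2):.4f}")
```

Because each step only needs a `fit`/`predict` pair, the Ridge stand-in could be swapped for any regressor, which is the "any base learner" point above.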
CyBooster grew out of mlsauce, which can be difficult to install on some systems. This version is also designed to be more GPU-friendly, thanks to JAX.
Installation
To install CyBooster, you can use pip:
pip install cybooster --verbose
From GitHub:
pip install git+https://github.com/Techtonique/cybooster.git --verbose
Usage
from cybooster import BoosterClassifier, BoosterRegressor
from sklearn.datasets import load_iris, load_diabetes, load_breast_cancer, load_digits, load_wine
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, root_mean_squared_error
from sklearn.linear_model import LinearRegression
from time import time
# Regression Example
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
regressor = BoosterRegressor(obj=LinearRegression(), n_estimators=100, learning_rate=0.1,
                             n_hidden_features=10, verbose=1, seed=42)
start = time()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)
print(f"Elapsed: {time() - start} s")
rmse = root_mean_squared_error(y_test, y_pred)
print(f"RMSE for regression: {rmse:.4f}")
# Classification Examples (iris, wine, breast cancer, digits)
for loader in (load_iris, load_wine, load_breast_cancer, load_digits):
    X, y = loader(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    classifier = BoosterClassifier(obj=LinearRegression(), n_estimators=100, learning_rate=0.1,
                                   n_hidden_features=10, verbose=1, seed=42)
    start = time()
    try:
        classifier.fit(X_train, y_train)
    except Exception:  # on Windows, labels may need an explicit integer dtype
        y_train = y_train.astype('int32')
        classifier.fit(X_train, y_train)
    y_pred = classifier.predict(X_test)
    print(f"Elapsed: {time() - start} s")
    accuracy = accuracy_score(y_test, y_pred)
    print(f"Accuracy on {loader.__name__}: {accuracy:.4f}")
File details
Details for the file cybooster-0.1.3.tar.gz.
File metadata
- Download URL: cybooster-0.1.3.tar.gz
- Upload date:
- Size: 4.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 06e044a33e652c2775a0d8b1d6bcf0bd7cb10889ce82a0e45d0129657c8edcf7 |
| MD5 | 9b92e2bf6525f5744c7358391ab5ad1b |
| BLAKE2b-256 | d80da99fb78d3cb5502a4f921d66aa3a5d9334636f0608fbe0471b9a9406b4ec |
File details
Details for the file cybooster-0.1.3-cp313-cp313-manylinux2014_x86_64.whl.
File metadata
- Download URL: cybooster-0.1.3-cp313-cp313-manylinux2014_x86_64.whl
- Upload date:
- Size: 1.5 MB
- Tags: CPython 3.13
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 88958bf28eb64cf2b66a41cc3167c05868be63eccc167b9c8bbd3c4867fa1768 |
| MD5 | 5a3d330be7be596c603eef2f4a7ef568 |
| BLAKE2b-256 | 4bd58d135725896f0dc5055d7354070c78b8b4176c24106a840915bc29183762 |
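After downloading a file, the published SHA256 digest above can be compared against the file's actual digest. A small sketch using Python's standard `hashlib`, demonstrated on an in-memory example (replace the temporary file with the downloaded archive's path, e.g. `cybooster-0.1.3.tar.gz`):

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=8192):
    """Stream a file through SHA256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo on a temporary file; in practice, pass the downloaded archive's path
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
    path = tmp.name
digest = sha256_of_file(path)
os.unlink(path)
print(digest)  # compare against the digest published for the download
```

Reading in chunks keeps memory use constant regardless of archive size.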