A high-performance gradient boosting implementation using Cython

Project description

CyBooster: A Gradient Boosting Library

CyBooster is a high-performance, generic gradient boosting library for classification and regression tasks; any base learner can be used. It is built on Cython for speed and efficiency, making it suitable for large datasets and complex models.

Each base learner is augmented with a randomized neural network (a generalization of https://www.researchgate.net/publication/346059361_LSBoost_gradient_boosted_penalized_nonlinear_least_squares to any base learner), which allows the model to learn complex, nonlinear patterns in the data.
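
To make the idea concrete, here is a minimal NumPy sketch of boosting on inputs augmented with a randomized hidden layer. It is not CyBooster's implementation: it assumes squared-error loss and an ordinary least-squares base learner, and the names `fit_boosted_rnn` and `predict_boosted_rnn`, the tanh activation, and the uniform weight sampling are illustrative choices only.

```python
import numpy as np


def fit_boosted_rnn(X, y, n_estimators=50, learning_rate=0.1,
                    n_hidden_features=5, seed=42):
    """Boost least-squares learners on inputs augmented with random hidden features."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    init = float(np.mean(y))          # stage 0: constant prediction
    residuals = y - init
    stages = []
    for _ in range(n_estimators):
        # Random hidden layer: weights are sampled once and never trained.
        W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden_features))
        # Augmented design matrix: intercept, original features, hidden features.
        Z = np.hstack([np.ones((n_samples, 1)), X, np.tanh(X @ W)])
        # Base learner (assumed here: ordinary least squares) fitted to the residuals.
        beta, *_ = np.linalg.lstsq(Z, residuals, rcond=None)
        residuals = residuals - learning_rate * (Z @ beta)
        stages.append((W, beta))
    return init, learning_rate, stages


def predict_boosted_rnn(model, X):
    """Sum the shrunken contributions of every stage on top of the constant."""
    init, learning_rate, stages = model
    pred = np.full(X.shape[0], init)
    for W, beta in stages:
        Z = np.hstack([np.ones((X.shape[0], 1)), X, np.tanh(X @ W)])
        pred = pred + learning_rate * (Z @ beta)
    return pred
```

Each stage draws a fresh random projection, so the ensemble combines many different nonlinear views of the inputs even though every individual learner is linear in its augmented features.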

CyBooster grew out of mlsauce, which can be difficult to install on some systems. This version will also be more GPU-friendly, thanks to JAX.

Installation

To install CyBooster, you can use pip:

pip install cybooster --verbose

From GitHub:

pip install git+https://github.com/Techtonique/cybooster.git --verbose
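
After either command, a quick way to confirm the install is to query the package metadata from Python. This is a small sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of CyBooster.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version of `distribution`, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("cybooster")
print(f"cybooster {version}" if version else "cybooster is not installed")
```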

Usage

from cybooster import BoosterClassifier, BoosterRegressor
from sklearn.datasets import load_iris, load_diabetes, load_breast_cancer, load_digits, load_wine
from sklearn.model_selection import train_test_split
# root_mean_squared_error requires scikit-learn >= 1.4
from sklearn.metrics import accuracy_score, root_mean_squared_error
from sklearn.linear_model import LinearRegression
from time import time


# Regression Example
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
regressor = BoosterRegressor(obj=LinearRegression(), n_estimators=100, learning_rate=0.1,
                             n_hidden_features=10, verbose=1, seed=42)
start = time()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)
print(f"Elapsed: {time() - start} s")
rmse = root_mean_squared_error(y_test, y_pred)
print(f"RMSE for regression: {rmse:.4f}")

# Classification Example
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
classifier = BoosterClassifier(obj=LinearRegression(), n_estimators=100, learning_rate=0.1,
                               n_hidden_features=10, verbose=1, seed=42)
start = time()
try:
    classifier.fit(X_train, y_train)
except Exception:  # fallback for Windows users: retry with int32 labels
    y_train = y_train.astype('int32')
    classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print(f"Elapsed: {time() - start} s")
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy for classification: {accuracy:.4f}")

# Classification Example: wine dataset
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
classifier = BoosterClassifier(obj=LinearRegression(), n_estimators=100, learning_rate=0.1,
                               n_hidden_features=10, verbose=1, seed=42)
start = time()
try:
    classifier.fit(X_train, y_train)
except Exception:  # fallback for Windows users: retry with int32 labels
    y_train = y_train.astype('int32')
    classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print(f"Elapsed: {time() - start} s")
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy for classification: {accuracy:.4f}")

# Classification Example: breast cancer dataset
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
classifier = BoosterClassifier(obj=LinearRegression(), n_estimators=100, learning_rate=0.1,
                               n_hidden_features=10, verbose=1, seed=42)
start = time()
try:
    classifier.fit(X_train, y_train)
except Exception:  # fallback for Windows users: retry with int32 labels
    y_train = y_train.astype('int32')
    classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print(f"Elapsed: {time() - start} s")
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy for classification: {accuracy:.4f}")

# Classification Example: digits dataset
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
classifier = BoosterClassifier(obj=LinearRegression(), n_estimators=100, learning_rate=0.1,
                               n_hidden_features=10, verbose=1, seed=42)
start = time()
try:
    classifier.fit(X_train, y_train)
except Exception:  # fallback for Windows users: retry with int32 labels
    y_train = y_train.astype('int32')
    classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
print(f"Elapsed: {time() - start} s")
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy for classification: {accuracy:.4f}")
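
The four classification blocks above repeat the same fit, fallback, predict, and timing steps. One way to fold them into a loop is a small helper like the sketch below; `fit_and_time` is a hypothetical name, not part of CyBooster, and it only assumes that the model exposes `fit` and `predict` and that the labels have an `astype` method.

```python
from time import perf_counter


def fit_and_time(model, X_train, y_train, X_test, fallback_dtype="int32"):
    """Fit `model`, predict on `X_test`, and return (predictions, elapsed seconds)."""
    start = perf_counter()
    try:
        model.fit(X_train, y_train)
    except Exception:
        # Same fallback as in the examples above: retry with int32 labels.
        model.fit(X_train, y_train.astype(fallback_dtype))
    y_pred = model.predict(X_test)
    return y_pred, perf_counter() - start


# Possible usage with the imports from the example above:
#
# for name, loader in [("iris", load_iris), ("wine", load_wine),
#                      ("breast_cancer", load_breast_cancer), ("digits", load_digits)]:
#     X, y = loader(return_X_y=True)
#     X_train, X_test, y_train, y_test = train_test_split(
#         X, y, test_size=0.2, random_state=42)
#     clf = BoosterClassifier(obj=LinearRegression(), n_estimators=100,
#                             learning_rate=0.1, n_hidden_features=10,
#                             verbose=1, seed=42)
#     y_pred, elapsed = fit_and_time(clf, X_train, y_train, X_test)
#     print(f"{name}: accuracy={accuracy_score(y_test, y_pred):.4f}, {elapsed:.2f} s")
```

`perf_counter` is used instead of `time` because it is a monotonic clock intended for measuring short durations.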

Download files


Source Distribution

cybooster-0.1.3.tar.gz (4.3 kB view details)

Uploaded Source

Built Distribution


cybooster-0.1.3-cp313-cp313-manylinux2014_x86_64.whl (1.5 MB view details)

Uploaded CPython 3.13
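
The wheel file name encodes which interpreter and platform the prebuilt binary targets; when no wheel matches the local environment, pip falls back to the source distribution. The sketch below splits a wheel file name into its tags following the standard wheel naming scheme; `parse_wheel_filename` is a hypothetical helper written for illustration, not a CyBooster or pip function.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its name, version, and compatibility tags."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[:-len(suffix)].split("-")
    # Layout: name-version[-build]-python_tag-abi_tag-platform_tag
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


tags = parse_wheel_filename("cybooster-0.1.3-cp313-cp313-manylinux2014_x86_64.whl")
print(tags["python_tag"], tags["platform_tag"])  # cp313 manylinux2014_x86_64
```

For the file listed above, the tags indicate a build for CPython 3.13 on x86-64 Linux.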
