WarpGBM
A fast GPU-accelerated Gradient Boosted Decision Tree library built with PyTorch + CUDA
WarpGBM is a high-performance, GPU-accelerated Gradient Boosted Decision Tree (GBDT) library built with PyTorch and CUDA. It offers blazing-fast histogram-based training and efficient prediction, and fits both research and production workflows.
Features
- GPU-accelerated training and histogram construction using custom CUDA kernels
- Drop-in scikit-learn style interface
- Supports pre-binned data or automatic quantile binning (see the binning sketch after this list)
- Simple install with pip
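For context on the automatic binning feature: quantile binning maps each continuous feature to a small set of integer bins whose edges are empirical quantiles, which is what makes fast histogram-based training possible. The NumPy sketch below illustrates the idea only; it is not WarpGBM's internal binning code.

import numpy as np

# Conceptual sketch of quantile binning (not WarpGBM's internals):
# interior quantiles become bin edges, and each value maps to a bin index.
rng = np.random.default_rng(0)
x = rng.normal(size=1000).astype(np.float32)
num_bins = 10
edges = np.quantile(x, np.linspace(0, 1, num_bins + 1)[1:-1])  # 9 interior edges
binned = np.searchsorted(edges, x).astype(np.int8)             # bins 0..9
print(binned.min(), binned.max())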
Performance Note
In our initial tests on an NVIDIA 3090 (local) and an A100 (Google Colab Pro), WarpGBM trains 14x to 20x faster than LightGBM using default configurations, while consuming significantly less RAM and CPU. These are early results; more thorough benchmarking is to come.
Installation
Recommended (GitHub, always latest):
pip install git+https://github.com/jefferythewind/warpgbm.git
This installs the latest version directly from GitHub and compiles CUDA extensions on your machine using your local PyTorch and CUDA setup. It's the most reliable method for ensuring compatibility and staying up to date with the latest features.
Alternatively (PyPI, stable releases):
pip install warpgbm
This installs from PyPI and also compiles CUDA code locally during installation. This method works well if your environment already has PyTorch with GPU support installed and configured.
Tip:
If you encounter an error related to mismatched or missing CUDA versions, try installing with the following flag:
pip install warpgbm --no-build-isolation
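You can also check which CUDA version your local PyTorch build was compiled against and compare it with your system toolkit (nvcc --version):

import torch

print(torch.__version__)   # installed PyTorch version
print(torch.version.cuda)  # CUDA version PyTorch was built against (None for CPU-only builds)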
Windows
Thank you, ShatteredX, for providing working instructions for a Windows installation.
git clone https://github.com/jefferythewind/warpgbm.git
cd warpgbm
python setup.py bdist_wheel
pip install .\dist\warpgbm-0.1.15-cp310-cp310-win_amd64.whl
Whichever method you use, make sure you've installed PyTorch with GPU support first:
https://pytorch.org/get-started/locally/
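A quick sanity check that PyTorch can actually see your GPU before WarpGBM's CUDA extensions are compiled:

import torch

assert torch.cuda.is_available(), "PyTorch cannot see a CUDA device"
print(torch.cuda.get_device_name(0))  # e.g. 'NVIDIA GeForce RTX 3090'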
Example
import numpy as np
from sklearn.datasets import make_regression
from time import time
import lightgbm as lgb
from warpgbm import WarpGBM
# Create synthetic regression dataset
X, y = make_regression(n_samples=100_000, n_features=500, noise=0.1, random_state=42)
X = X.astype(np.float32)
y = y.astype(np.float32)
# Train LightGBM
start = time()
lgb_model = lgb.LGBMRegressor(max_depth=5, n_estimators=100, learning_rate=0.01, max_bin=7)
lgb_model.fit(X, y)
lgb_time = time() - start
lgb_preds = lgb_model.predict(X)
# Train WarpGBM
start = time()
wgbm_model = WarpGBM(max_depth=5, n_estimators=100, learning_rate=0.01, num_bins=7)
wgbm_model.fit(X, y)
wgbm_time = time() - start
wgbm_preds = wgbm_model.predict(X)
# Results
print(f"LightGBM: corr = {np.corrcoef(lgb_preds, y)[0,1]:.4f}, time = {lgb_time:.2f}s")
print(f"WarpGBM: corr = {np.corrcoef(wgbm_preds, y)[0,1]:.4f}, time = {wgbm_time:.2f}s")
Results (Ryzen 9 CPU, NVIDIA 3090 GPU):
LightGBM: corr = 0.8742, time = 37.33s
WarpGBM: corr = 0.8621, time = 5.40s
Pre-binned Data Example (Numerai)
WarpGBM can save additional training time if your dataset is already pre-binned. The Numerai tournament data is a great example:
import pandas as pd
from numerapi import NumerAPI
from time import time
import lightgbm as lgb
from warpgbm import WarpGBM
import numpy as np
napi = NumerAPI()
napi.download_dataset('v5.0/train.parquet', 'train.parquet')
train = pd.read_parquet('train.parquet')
feature_set = [f for f in train.columns if 'feature' in f]
target = 'target_cyrus'
X_np = train[feature_set].astype('int8').values
Y_np = train[target].values
# LightGBM
start = time()
lgb_model = lgb.LGBMRegressor(max_depth=5, n_estimators=100, learning_rate=0.01, max_bin=7)
lgb_model.fit(X_np, Y_np)
lgb_time = time() - start
lgb_preds = lgb_model.predict(X_np)
# WarpGBM
start = time()
wgbm_model = WarpGBM(max_depth=5, n_estimators=100, learning_rate=0.01, num_bins=7)
wgbm_model.fit(X_np, Y_np)
wgbm_time = time() - start
wgbm_preds = wgbm_model.predict(X_np)
# Results
print(f"LightGBM: corr = {np.corrcoef(lgb_preds, Y_np)[0,1]:.4f}, time = {lgb_time:.2f}s")
print(f"WarpGBM: corr = {np.corrcoef(wgbm_preds, Y_np)[0,1]:.4f}, time = {wgbm_time:.2f}s")
Results (Google Colab Pro, A100 GPU):
LightGBM: corr = 0.0703, time = 643.88s
WarpGBM: corr = 0.0660, time = 49.16s
Run it live in Colab
You can try WarpGBM in a live Colab notebook using real pre-binned Numerai tournament data:
No installation required — just press "Open in Playground", then Run All!
Documentation
WarpGBM Parameters:
- num_bins: Number of histogram bins to use (default: 10)
- max_depth: Maximum depth of trees (default: 3)
- learning_rate: Shrinkage rate applied to leaf outputs (default: 0.1)
- n_estimators: Number of boosting iterations (default: 100)
- min_child_weight: Minimum sum of instance weight needed in a child (default: 20)
- min_split_gain: Minimum loss reduction required to make a further partition (default: 0.0)
- histogram_computer: Choice of histogram kernel ('hist1', 'hist2', 'hist3') (default: 'hist3')
- threads_per_block: CUDA threads per block (default: 32)
- rows_per_thread: Number of training rows processed per thread (default: 4)
- L2_reg: L2 regularizer (default: 1e-6)
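For example, constructing a model that overrides several of the defaults above (the values are arbitrary illustrations, not tuned recommendations):

from warpgbm import WarpGBM

model = WarpGBM(
    num_bins=16,            # more histogram bins than the default 10
    max_depth=6,            # deeper trees than the default 3
    learning_rate=0.05,     # smaller shrinkage than the default 0.1
    n_estimators=200,       # more boosting rounds than the default 100
    histogram_computer='hist3',
    threads_per_block=32,
    rows_per_thread=4,
    L2_reg=1e-4,            # stronger L2 regularization than the default 1e-6
)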
Methods:
- .fit(X, y, era_id=None): Train the model. X can be raw floats or pre-binned int8 data. era_id is optional and used internally.
- .predict(X, chunksize=50_000): Predict on new raw float or pre-binned data.
- .predict_numpy(X, chunksize=50_000): Same as .predict(X) but without using the GPU.
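A minimal end-to-end sketch of these methods using random placeholder data (the arrays here are illustrative only):

import numpy as np
from warpgbm import WarpGBM

# Placeholder pre-binned int8 features, float target, and optional era labels
X_binned = np.random.randint(0, 10, size=(1000, 20)).astype(np.int8)
y = np.random.randn(1000).astype(np.float32)
eras = np.repeat(np.arange(10), 100)  # one era label per row

model = WarpGBM(num_bins=10, max_depth=3, n_estimators=100)
model.fit(X_binned, y, era_id=eras)

preds = model.predict(X_binned)            # GPU prediction
preds_cpu = model.predict_numpy(X_binned)  # same predictions without the GPU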
Acknowledgements
WarpGBM builds on the shoulders of PyTorch, scikit-learn, LightGBM, and the CUDA ecosystem. Thanks to all contributors in the GBDT research and engineering space.