wideboost

Implements wide boosting using popular boosting frameworks as a backend.

Getting started

pip install wideboost

Sample scripts

XGBoost back-end

import xgboost as xgb
from wideboost.wrappers import wxgb
from pydataset import data
import numpy as np

########
## Get and format the data
DAT = np.asarray(data('Yogurt'))
X = DAT[:, 0:9]
Y = np.zeros([X.shape[0], 1])   ## class 0 is the remaining brand, 'yoplait'
Y[DAT[:, 9] == 'dannon'] = 1
Y[DAT[:, 9] == 'hiland'] = 2
Y[DAT[:, 9] == 'weight'] = 3

n = X.shape[0]
np.random.seed(123)
train_idx = np.random.choice(np.arange(n), round(n*0.5), replace=False)
test_idx = np.setdiff1d(np.arange(n), train_idx)

dtrain = xgb.DMatrix(X[train_idx, :], label=Y[train_idx, :])
dtest = xgb.DMatrix(X[test_idx, :], label=Y[test_idx, :])
#########

#########
## Set parameters and run wide boosting

param = {'btype':'I',      ## wideboost param -- one of 'I', 'In', 'R', 'Rn'
         'extra_dims':10,  ## wideboost param -- integer >= 0
         'max_depth':8,
         'eta':0.1,
         'objective':'multi:softmax',
         'num_class':4,
         'eval_metric':['merror'] }

num_round = 100
watchlist = [(dtrain,'train'),(dtest,'test')]
wxgb_results = dict()
bst = wxgb.train(param, dtrain, num_round, watchlist, evals_result=wxgb_results)
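
After training, the returned booster can be used like a regular XGBoost booster, and wxgb_results holds the evaluation history. A minimal follow-up sketch, assuming the wrapper mirrors xgboost's predict and evals_result conventions:

## Assumes an xgboost-style Booster and evals_result layout.
preds = bst.predict(dtest)   ## class labels, since the objective is 'multi:softmax'
print("final test merror:", wxgb_results['test']['merror'][-1])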

LightGBM back-end

import lightgbm as lgb
from wideboost.wrappers import wlgb
from pydataset import data
import numpy as np

########
## Get and format the data
DAT = np.asarray(data('Yogurt'))
X = DAT[:, 0:9]
Y = np.zeros([X.shape[0], 1])   ## class 0 is the remaining brand, 'yoplait'
Y[DAT[:, 9] == 'dannon'] = 1
Y[DAT[:, 9] == 'hiland'] = 2
Y[DAT[:, 9] == 'weight'] = 3

n = X.shape[0]
np.random.seed(123)
train_idx = np.random.choice(np.arange(n), round(n*0.5), replace=False)
test_idx = np.setdiff1d(np.arange(n), train_idx)

train_data = lgb.Dataset(X[train_idx, :], label=Y[train_idx, 0])
test_data = lgb.Dataset(X[test_idx, :], label=Y[test_idx, 0])
#########

#########
## Set parameters and run wide boosting

param = {'btype':'I',      ## wideboost param -- one of 'I', 'In', 'R', 'Rn'
         'extra_dims':10,  ## wideboost param -- integer >= 0
         'objective':'multiclass',
         'metric':'multi_error',
         'num_class':4,
         'learning_rate': 0.1
        }

wlgb_results = dict()
bst = wlgb.train(param, train_data, valid_sets=[test_data], num_boost_round=100, evals_result=wlgb_results)
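
As with the XGBoost wrapper, the evaluation history lands in wlgb_results. A hedged sketch for reading it, assuming the wrapper follows LightGBM's default naming, under which an unnamed validation set is keyed 'valid_0':

print("final test multi_error:", wlgb_results['valid_0']['multi_error'][-1])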

Explainers

To interpret wideboost models, wideboost connects to basic functionality from SHAP. An example, continuing from the XGBoost script above:

from wideboost.explainers.shap import WTreeExplainer
import shap

## Continues the XGBoost example above ('bst', 'data', 'xgb' and 'np' are in scope).
explainer = WTreeExplainer(bst)
shap_values = explainer.shap_values(data('Yogurt').iloc[0:1, 0:9])

shap.initjs()
print(bst.predict(xgb.DMatrix(np.asarray(data('Yogurt'))[0:1, 0:9])))
shap.force_plot(explainer.expected_value[3], shap_values[3][0, :], data('Yogurt').iloc[0, 0:9])
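
Assuming WTreeExplainer follows SHAP's usual multiclass layout, shap_values is a list with one array per class, so the [3] indexing above selects class 3 ('weight' under the label encoding used earlier) and explainer.expected_value[3] is the matching base value.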

[Image: SHAP force plot produced by the snippet above]

Parameter Explanations

'btype' indicates how to initialize the beta matrix. Settings are 'I', 'In', 'R', 'Rn'.
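
Per the referenced paper, predictions take the form F(X) @ beta, so beta maps the wide ensemble output back down to num_class columns. The exact initialization schemes live in the package source; as a rough, unofficial sketch (assuming 'I' and 'R' stand for identity-based and random starts, and a trailing 'n' for a column-normalized variant):

import numpy as np

def sketch_beta(btype, num_class, extra_dims):
    ## Illustration only -- not wideboost's actual initialization code.
    rng = np.random.default_rng(0)
    if btype.startswith('I'):
        ## Identity block passes the first num_class outputs straight through;
        ## the extra rows are a guess at how the widened dimensions enter.
        B = np.vstack([np.eye(num_class), rng.normal(size=(extra_dims, num_class))])
    else:
        B = rng.normal(size=(num_class + extra_dims, num_class))
    if btype.endswith('n'):
        B = B / np.linalg.norm(B, axis=0, keepdims=True)   ## 'n' variant: normalize columns (assumption)
    return B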

'extra_dims' is an integer indicating how many "wide" dimensions are used. When 'extra_dims' is set to 0 (and 'btype' is set to 'I'), wide boosting is equivalent to standard gradient boosting.
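
A dimension-only sketch of that equivalence (illustrative, not the package internals): F(X) has shape (n, num_class + extra_dims) and beta has shape (num_class + extra_dims, num_class), so with 'extra_dims' at 0 and beta equal to the identity, F(X) @ beta is just F(X):

import numpy as np

n, num_class, extra_dims = 5, 4, 10
F = np.random.randn(n, num_class + extra_dims)     ## wide ensemble output
beta = np.random.randn(num_class + extra_dims, num_class)
print((F @ beta).shape)                            ## (5, 4) -- back to num_class columns

F0 = np.random.randn(n, num_class)                 ## extra_dims = 0
assert np.allclose(F0 @ np.eye(num_class), F0)     ## beta = I recovers plain boosting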

Reference

https://arxiv.org/pdf/2007.09855.pdf
