Skip to main content

Implements Wide Boosting functions for popular boosting packages

Project description

wideboost

Implements wide boosting using popular boosting frameworks as a backend.

Getting started

pip install wideboost

Sample scripts

XGBoost back-end

import xgboost as xgb
from wideboost.wrappers import wxgb
from pydataset import data
import numpy as np

########
## Get and format the data
DAT = np.asarray(data('Yogurt'))
X = DAT[:,0:9]
Y = np.zeros([X.shape[0],1])
Y[DAT[:,9] == 'dannon'] = 1
Y[DAT[:,9] == 'hiland'] = 2
Y[DAT[:,9] == 'weight'] = 3

n = X.shape[0]
np.random.seed(123)
train_idx = np.random.choice(np.arange(n),round(n*0.5),replace=False)
test_idx = np.setdiff1d(np.arange(n),train_idx)

dtrain = xgb.DMatrix(X[train_idx,:],label=Y[train_idx,:])
dtest = xgb.DMatrix(X[test_idx,:],label=Y[test_idx,:])
#########

#########
## Set parameters and run wide boosting

param = {'btype':'I',      ## wideboost param -- one of 'I', 'In', 'R', 'Rn'
         'extra_dims':10,  ## wideboost param -- integer >= 0
         'max_depth':8,
         'eta':0.1,
         'objective':'multi:softmax',
         'num_class':4,
         'eval_metric':['merror'] }

num_round = 100
watchlist = [(dtrain,'train'),(dtest,'test')]
wxgb_results = dict()
bst = wxgb.train(param, dtrain, num_round,watchlist,evals_result=wxgb_results)

LightGBM back-end

import lightgbm as lgb
from wideboost.wrappers import wlgb
from pydataset import data
import numpy as np

########
## Get and format the data
DAT = np.asarray(data('Yogurt'))
X = DAT[:,0:9]
Y = np.zeros([X.shape[0],1])
Y[DAT[:,9] == 'dannon'] = 1
Y[DAT[:,9] == 'hiland'] = 2
Y[DAT[:,9] == 'weight'] = 3

n = X.shape[0]
np.random.seed(123)
train_idx = np.random.choice(np.arange(n),round(n*0.5),replace=False)
test_idx = np.setdiff1d(np.arange(n),train_idx)

train_data = lgb.Dataset(X[train_idx,:],label=Y[train_idx,0])
test_data = lgb.Dataset(X[test_idx,:],label=Y[test_idx,0])
#########

#########
## Set parameters and run wide boosting

param = {'btype':'I',      ## wideboost param -- one of 'I', 'In', 'R', 'Rn'
         'extra_dims':10,  ## wideboost param -- integer >= 0
         'objective':'multiclass',
         'metric':'multi_error',
         'num_class':4,
         'learning_rate': 0.1
        }

wlgb_results = dict()
bst = wlgb.train(param, train_data, valid_sets=test_data, num_boost_round=100, evals_result=wlgb_results)

Explainers

As a way to interpret wideboost models, we connect to basic functionality from SHAP. Example here:

from wideboost.explainers.shap import WTreeExplainer
import shap

explainer = WTreeExplainer(bst)
shap_values = explainer.shap_values(data('Yogurt').iloc[0:1,0:9])

shap.initjs()
print(bst.predict(xgb.DMatrix(np.asarray(data('Yogurt'))[0:1,0:9])))
shap.force_plot(explainer.expected_value[3],shap_values[3][0,:],data('Yogurt').iloc[0,0:9])

wideboost-shap

Parameter Explanations

'btype' indicates how to initialize the beta matrix. Settings are 'I', 'In', 'R', 'Rn'.

'extra_dims' integer indicating how many "wide" dimensions are used. When 'extra_dims' is set to 0 (and 'btype' is set to 'I') then wide boosting is equivalent to standard gradient boosting.

For more details see the documentation.

Documentation

https://mthorrell.github.io/wideboost/

Reference

https://arxiv.org/pdf/2007.09855.pdf

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

wideboost-0.3.1.tar.gz (10.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

wideboost-0.3.1-py3-none-any.whl (16.1 kB view details)

Uploaded Python 3

File details

Details for the file wideboost-0.3.1.tar.gz.

File metadata

  • Download URL: wideboost-0.3.1.tar.gz
  • Upload date:
  • Size: 10.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.8.5

File hashes

Hashes for wideboost-0.3.1.tar.gz
Algorithm Hash digest
SHA256 cab614fd3f99d13370bb85ab9131fb07b3f9797eee760274da4710230dfafc4b
MD5 655464766ab213841ab9ac04c313abfa
BLAKE2b-256 b22fe1725852e0f1f7359e4620021f6aa79a3da895729085cdb7226c6564c591

See more details on using hashes here.

File details

Details for the file wideboost-0.3.1-py3-none-any.whl.

File metadata

  • Download URL: wideboost-0.3.1-py3-none-any.whl
  • Upload date:
  • Size: 16.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.8.5

File hashes

Hashes for wideboost-0.3.1-py3-none-any.whl
Algorithm Hash digest
SHA256 21273bc3d5b42ccb50c66871b37084dbaafce5880cc25b72b13aa4311ce563d9
MD5 77590c93b75b1c947514e64ad4785a0c
BLAKE2b-256 a6a5b8a286a5d38c42218f5973cac850fad3887492340092716448602b1cacd1

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page