Implements Wide Boosting functions for popular boosting packages
wideboost

Implements wide boosting using popular boosting frameworks as backends.

Getting started

pip install wideboost

Sample scripts

XGBoost back-end

import xgboost as xgb
from wideboost.wrappers import wxgb
from pydataset import data
import numpy as np

########
## Get and format the data
DAT = np.asarray(data('Yogurt'))
X = DAT[:,0:9].astype(float)  ## first nine columns are numeric features
Y = np.zeros([X.shape[0],1])  ## the remaining brand ('yoplait') keeps label 0
Y[DAT[:,9] == 'dannon'] = 1
Y[DAT[:,9] == 'hiland'] = 2
Y[DAT[:,9] == 'weight'] = 3

n = X.shape[0]
np.random.seed(123)
train_idx = np.random.choice(np.arange(n),round(n*0.5),replace=False)
test_idx = np.setdiff1d(np.arange(n),train_idx)

dtrain = xgb.DMatrix(X[train_idx,:],label=Y[train_idx,:])
dtest = xgb.DMatrix(X[test_idx,:],label=Y[test_idx,:])
#########

#########
## Set parameters and run wide boosting

param = {'btype':'I',      ## wideboost param -- one of 'I', 'In', 'R', 'Rn'
         'extra_dims':10,  ## wideboost param -- integer >= 0
         'max_depth':8,
         'eta':0.1,
         'objective':'multi:softmax',
         'num_class':4,
         'eval_metric':['merror'] }

num_round = 100
watchlist = [(dtrain,'train'),(dtest,'test')]
wxgb_results = dict()
bst = wxgb.train(param, dtrain, num_round, watchlist, evals_result=wxgb_results)
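The `evals_result` dict is filled in during training; assuming it follows XGBoost's usual `evals_result` layout (dataset name → metric name → one value per round), you can pull out the best round afterwards. A minimal sketch with a hand-built dict of that assumed shape, so it runs without a trained model:

```python
# Assumed layout of wxgb_results after training (XGBoost's evals_result convention):
# {dataset_name: {metric_name: [value_per_round, ...]}}
wxgb_results = {
    'train': {'merror': [0.40, 0.31, 0.25]},
    'test':  {'merror': [0.45, 0.38, 0.33]},
}

# Index of the boosting round with the lowest test error.
best_round = min(
    range(len(wxgb_results['test']['merror'])),
    key=wxgb_results['test']['merror'].__getitem__,
)
print("best round:", best_round,
      "test merror:", wxgb_results['test']['merror'][best_round])
```

With a real run, the lists simply hold one entry per boosting round (here, `num_round = 100`).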

LightGBM back-end

import lightgbm as lgb
from wideboost.wrappers import wlgb
from pydataset import data
import numpy as np

########
## Get and format the data
DAT = np.asarray(data('Yogurt'))
X = DAT[:,0:9].astype(float)  ## first nine columns are numeric features
Y = np.zeros([X.shape[0],1])  ## the remaining brand ('yoplait') keeps label 0
Y[DAT[:,9] == 'dannon'] = 1
Y[DAT[:,9] == 'hiland'] = 2
Y[DAT[:,9] == 'weight'] = 3

n = X.shape[0]
np.random.seed(123)
train_idx = np.random.choice(np.arange(n),round(n*0.5),replace=False)
test_idx = np.setdiff1d(np.arange(n),train_idx)

train_data = lgb.Dataset(X[train_idx,:],label=Y[train_idx,0])
test_data = lgb.Dataset(X[test_idx,:],label=Y[test_idx,0])
#########

#########
## Set parameters and run wide boosting

param = {'btype':'I',      ## wideboost param -- one of 'I', 'In', 'R', 'Rn'
         'extra_dims':10,  ## wideboost param -- integer >= 0
         'objective':'multiclass',
         'metric':'multi_error',
         'num_class':4,
         'learning_rate': 0.1
        }

wlgb_results = dict()
bst = wlgb.train(param, train_data, valid_sets=test_data, num_boost_round=100, evals_result=wlgb_results)

Explainers

To interpret wideboost models, we connect to basic functionality from SHAP. For example:

## Continues from the XGBoost example above (reuses bst, xgb, np and data).
from wideboost.explainers.shap import WTreeExplainer
import shap

explainer = WTreeExplainer(bst)
shap_values = explainer.shap_values(data('Yogurt').iloc[0:1,0:9])

shap.initjs()
print(bst.predict(xgb.DMatrix(np.asarray(data('Yogurt'))[0:1,0:9])))
shap.force_plot(explainer.expected_value[3],shap_values[3][0,:],data('Yogurt').iloc[0,0:9])


Parameter Explanations

'btype' indicates how to initialize the beta matrix. Settings are 'I', 'In', 'R', 'Rn'.

'extra_dims' is an integer indicating how many "wide" dimensions are used. When 'extra_dims' is 0 (and 'btype' is 'I'), wide boosting is equivalent to standard gradient boosting.
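A minimal sketch of the idea behind these two parameters (not the package's actual implementation): wide boosting learns a function with `num_class + extra_dims` output columns and multiplies it by a beta matrix that maps back down to `num_class` scores. With `extra_dims=0` and an identity beta (`btype='I'`), that product changes nothing, which is why the result matches standard gradient boosting:

```python
import numpy as np

num_class, extra_dims = 4, 0
wide_dim = num_class + extra_dims

# btype='I' with extra_dims=0: beta is the 4x4 identity, so F @ beta == F,
# i.e. wide boosting collapses to standard gradient boosting output.
beta = np.eye(wide_dim, num_class)

F = np.random.rand(5, wide_dim)  # raw "wide" boosting outputs for 5 rows
scores = F @ beta                # per-class scores fed to the loss
assert np.allclose(scores, F)    # identical when extra_dims=0, btype='I'
```

With `extra_dims > 0`, `F` has more columns than classes and beta mixes them down, which is where the extra capacity comes from.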

For more details see the documentation.

Documentation

https://mthorrell.github.io/wideboost/

Reference

https://arxiv.org/pdf/2007.09855.pdf
