
The Python interface for Dr. Matthew Pratola's OpenBT project. It allows the user to perform BART (Bayesian Additive Regression Trees) fits and predictions on datasets.

Project description

Openbt

This Python package is the Python interface for Dr. Matthew Pratola's OpenBT project. Currently, its only module is openbt, which contains the OPENBT class. This class allows the user to create fit objects in a scikit-learn style.

About:

OpenBT is a flexible and extensible C++ framework for implementing Bayesian regression tree models. Currently a number of models and inference tools are available for use in the released code with additional models/tools under development. The code makes use of MPI for parallel computing. Apart from this package, an R interface is provided via the ROpenbt package to demonstrate use of the software.

How to utilize this package (and its module and class):

  1. Install the package from the command line by typing:
    $ python3 -m pip install openbt==[version number you want]
  2. In Python 3 (or a Python script), import the OPENBT class from the openbt module by typing:
    from openbt.openbt import OPENBT
    This gives Python access to the OPENBT class. Typing
    from openbt.openbt import *
    or
    from openbt import openbt
    would also work. With the former, the load() function is imported unnecessarily (unless you wish to use that function, of course). With the latter, the class must be referred to as openbt.OPENBT, not simply OPENBT.
  3. To use the OPENBT class and its methods in Python 3 to conduct and interpret fits, create a fit object such as
    m = OPENBT(model = "bart", ...)
    The fit object is an instance of the class. Here is an example of calling one of the class's methods:
    fitp = m.predict(preds)
  4. Example scripts showing the usage of the OPENBT class on data were uploaded with this package (in the "PyScripts" folder). If these are difficult to access, you can also view them on the GitHub "Homepage".

Example:

See branin_ex.py in the PyScripts folder of this package for the script version of this walkthrough (or see the multiple example datasets and fits on the GitHub Homepage).

To start, let's create a test function. A popular one is the Branin function:

# Test Branin function, rescaled
import math

def braninsc(xx):
    # Inputs are expected on the unit square [0, 1]^2
    x1 = xx[0]
    x2 = xx[1]

    # Map the inputs back to the original Branin domain
    x1bar = 15 * x1 - 5
    x2bar = 15 * x2

    term1 = x2bar - 5.1*x1bar**2/(4*math.pi**2) + 5*x1bar/math.pi - 6
    term2 = (10 - 10/(8*math.pi)) * math.cos(x1bar)

    # Centre and scale the response
    y = (term1**2 + term2 - 44.81) / 51.95
    return y


# Simulate branin data for testing
import numpy as np
np.random.seed(99)
n = 500
p = 2
x = np.random.uniform(size=n*p).reshape(n,p)
y = np.zeros(n)
for i in range(n):
    y[i] = braninsc(x[i])
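
The loop above evaluates the test function one row at a time. As a minimal sketch, the same data can be generated with a single array operation. The name braninsc_vec is introduced here for illustration only and is not part of the openbt package; the final line checks that both versions agree on the simulated inputs.

```python
import math
import numpy as np

def braninsc(xx):
    """Rescaled Branin function for a single point xx = (x1, x2)."""
    x1bar = 15 * xx[0] - 5
    x2bar = 15 * xx[1]
    term1 = x2bar - 5.1 * x1bar**2 / (4 * math.pi**2) + 5 * x1bar / math.pi - 6
    term2 = (10 - 10 / (8 * math.pi)) * math.cos(x1bar)
    return (term1**2 + term2 - 44.81) / 51.95

def braninsc_vec(x):
    """Hypothetical helper: the same formula applied to every row of an (n, 2) array."""
    x = np.asarray(x, dtype=float)
    x1bar = 15 * x[:, 0] - 5
    x2bar = 15 * x[:, 1]
    term1 = x2bar - 5.1 * x1bar**2 / (4 * np.pi**2) + 5 * x1bar / np.pi - 6
    term2 = (10 - 10 / (8 * np.pi)) * np.cos(x1bar)
    return (term1**2 + term2 - 44.81) / 51.95

# Same simulation settings as in the walkthrough
np.random.seed(99)
n, p = 500, 2
x = np.random.uniform(size=n * p).reshape(n, p)

y_loop = np.array([braninsc(row) for row in x])  # row-by-row evaluation
y_vec = braninsc_vec(x)                          # one vectorised call
print(np.allclose(y_loop, y_vec))
```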

Next, we can load the openbt package and fit a BART model. Here we set the model type as model="bart", which ensures we fit a homoscedastic BART model. The number of MPI threads to use is specified as tc=4. For a list of all optional parameters, see m.__dict__ (after creating m) or help(OPENBT).

from openbt.openbt import OPENBT, load
m = OPENBT(model = "bart", tc = 4, modelname = "branin")
fit = m.fit(x, y)

Next we can construct predictions and make a simple plot comparing our predictions to the training data. Here, we are calculating the in-sample predictions since we passed the same x array to the predict() function.

# Calculate in-sample predictions
fitp = m.predict(x, tc = 4)

# Make a simple plot of observed vs. fitted values
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(16,9))
ax = fig.add_subplot(111)
ax.plot(y, fitp['mmean'], 'ro')
ax.set_xlabel("Observed")
ax.set_ylabel("Fitted")
# Reference line y = x: points on this line are fitted exactly
ax.axline([0, 0], [1, 1])
plt.show()
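
The plot gives a visual check; a numeric summary can complement it. The following is a rough sketch, not part of the openbt package: fit_summary is a hypothetical helper, and the small arrays are stand-ins for the observed y and the fitted values fitp['mmean'] from the walkthrough.

```python
import numpy as np

def fit_summary(observed, fitted):
    """Hypothetical helper: RMSE and R^2 between observed and fitted values."""
    observed = np.asarray(observed, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    resid = observed - fitted
    ss_res = float(np.sum(resid**2))                          # residual sum of squares
    ss_tot = float(np.sum((observed - observed.mean())**2))   # total sum of squares
    return {
        "rmse": float(np.sqrt(np.mean(resid**2))),
        "r2": 1.0 - ss_res / ss_tot,
    }

# Stand-in data; with a real fit one would pass y and fitp['mmean'] instead
obs = np.array([0.0, 1.0, 2.0, 3.0])
pred = np.array([0.1, 0.9, 2.1, 2.9])
summary = fit_summary(obs, pred)
print(summary)
```

Because these are in-sample predictions, both numbers will tend to look better than they would on held-out data.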

To save the model, use OPENBT's save() function. Similarly, load the model using load(). Because the posterior can be large in sample-based models such as these, the fitted model is saved in a compressed file format with the extension .obt. Additionally, the estimator object can be saved and loaded (see below).

#--------------------------------------------------------------------------------------------
# Save fitted MODEL object (not the estimator object, m) as test.obt in the working directory
m.save(fit, "test", est = False)
# Load fitted model object (AKA fit object) to a new object
fit2 = load("test", est = False)

# We can also save/load the fit ESTIMATOR object by specifying est = True in save()/load().
# The estimator object has all our settings and properties, but not fit results. 
# This is similar to scikit-learn saving/loading its estimators.
m.save("test_fit_est", est = True)
m2 = load("test_fit_est", est = True)
# If you wish, you can see that m2 (the loaded estimator object) can perform fits:
# fit3 = m2.fit(x, y)
# m2 can perform predictions, too:
# fitp2 = m2.predict(x, tc = 4)
#--------------------------------------------------------------------------------------------

The standard variable activity information, calculated as the proportion of splitting rules involving each variable, can be computed using OPENBT's vartivity() function.

# Calculate variable activity information
fitv = m.vartivity()
print(fitv['mvdraws'])
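
To make "proportion of splitting rules involving each variable" concrete, here is a small illustration with made-up split counts. It is not OpenBT's internal code, and the array values are invented; it only assumes that the activity is computed per posterior draw and then averaged over draws, which is presumably the kind of summary that fitv['mvdraws'] reports.

```python
import numpy as np

# Made-up example: rows are posterior draws, columns are variables x1 and x2.
# Each entry counts the splitting rules that use that variable in that draw.
split_counts = np.array([
    [12, 8],
    [10, 10],
    [15, 5],
])

# Proportion of splitting rules per variable, within each draw
row_totals = split_counts.sum(axis=1, keepdims=True)
activity_draws = split_counts / row_totals

# Average the per-draw proportions over the posterior draws
activity_mean = activity_draws.mean(axis=0)
print(activity_mean)
```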

A more accurate alternative for assessing variable importance is to calculate the Sobol sensitivity indices.

# Calculate Sobol indices
fits = m.sobol(cmdopt = 'MPI', tc = 4)
print(fits['msi'])
print(fits['mtsi'])
print(fits['msij'])
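
For intuition about what Sobol indices measure, they can also be estimated by plain Monte Carlo on a cheap function. The sketch below is independent of OpenBT and does not reproduce how m.sobol() computes its results: sobol_indices and additive are hypothetical names, inputs are assumed uniform on the unit cube, and the estimators used are the first-order estimator of Saltelli et al. (2010) and the Jansen estimator for total indices.

```python
import numpy as np

def sobol_indices(f, p, n=50_000, seed=0):
    """Monte Carlo first-order and total Sobol indices for f on [0, 1]^p.

    f must accept an (n, p) array and return an array of n outputs.
    """
    rng = np.random.default_rng(seed)
    a = rng.uniform(size=(n, p))   # first independent sample
    b = rng.uniform(size=(n, p))   # second independent sample
    fa, fb = f(a), f(b)

    # Centre the outputs to reduce the variance of the estimators
    centre = np.mean(np.concatenate([fa, fb]))
    fa, fb = fa - centre, fb - centre
    var = np.var(np.concatenate([fa, fb]))

    first = np.empty(p)
    total = np.empty(p)
    for i in range(p):
        ab = a.copy()
        ab[:, i] = b[:, i]         # sample a with column i taken from b
        fab = f(ab) - centre
        first[i] = np.mean(fb * (fab - fa)) / var        # Saltelli et al. (2010)
        total[i] = 0.5 * np.mean((fa - fab) ** 2) / var  # Jansen
    return first, total

def additive(x):
    """Toy function with known indices: y = x1 + 2 * x2."""
    return x[:, 0] + 2.0 * x[:, 1]

first, total = sobol_indices(additive, p=2)
print(first, total)
```

For this additive toy function the exact first-order indices are 0.2 and 0.8, and the total indices equal the first-order ones because there is no interaction. Passing a vectorised version of the Branin function instead would give reference values to compare against the indices reported for the fitted model.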

Again, for more examples of using OpenBT, explore the PyScripts folder in the GitHub repo.

See Also:

GitHub "Homepage" for this package
PyPI Package Home
Zoltan Puha's class (the current class was built as a modification to this)
ROpenBT Project Home
