Python interface to OpenBT

pyopenbt

This Python package is the Python interface for Dr. Matthew Pratola's OpenBT project. Currently, its only module is openbt, which contains the OPENBT class. This class allows the user to create fit objects in a scikit-learn style.

About:

OpenBT is a flexible and extensible C++ framework for implementing Bayesian regression tree models. Currently a number of models and inference tools are available for use in the released code with additional models/tools under development. The code makes use of MPI for parallel computing. Apart from this package, an R interface is provided via the ROpenbt package to demonstrate use of the software.

How to utilize this package (and its module and class):

  1. Install the package from the command line by typing:
    $ python -m pip install pyopenbt
  2. In Python 3 (or a Python script), import the OPENBT class from the openbt module by typing:
    from pyopenbt.openbt import OPENBT
    This gives Python access to the OPENBT class. Typing
    from pyopenbt.openbt import *
    or
    from pyopenbt import openbt
    would also work. The former also loads the obt_load() function, which is unnecessary unless you plan to use it. With the latter, the class must be referred to as openbt.OPENBT, not simply OPENBT.
  3. To conduct and interpret fits with the OPENBT class in Python 3, create an estimator object such as
    m = OPENBT(model = "bart", ...)
    This object is an instance of the class. Here is an example of calling one of the class's methods:
    fitp = m.predict(preds)
  4. See the example scripts in the "examples" folder, which show the OPENBT class applied to data.

Example:

To start, let's create a test function. A popular one is the Branin function:

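# Test Branin function, rescaled
import math

def braninsc(xx):
    # Inputs are expected on the unit square [0, 1]^2
    x1 = xx[0]
    x2 = xx[1]

    # Map the inputs to the usual Branin domain [-5, 10] x [0, 15]
    x1bar = 15 * x1 - 5
    x2bar = 15 * x2

    term1 = x2bar - 5.1*x1bar**2/(4*math.pi**2) + 5*x1bar/math.pi - 6
    term2 = (10 - 10/(8*math.pi)) * math.cos(x1bar)

    # Center and scale the response
    y = (term1**2 + term2 - 44.81) / 51.95
    return y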


# Simulate branin data for testing
import numpy as np
np.random.seed(99)
n = 500
p = 2
x = np.random.uniform(size=n*p).reshape(n,p)
y = np.zeros(n)
for i in range(n):
    y[i] = braninsc(x[i,])
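
Since the inputs are stored as a NumPy array, the same test function can also be evaluated on all rows at once. The following is a minimal sketch of that idea, not part of pyopenbt; the names braninsc_vec, x_demo and y_demo are made up for this illustration.

```python
import numpy as np

def braninsc_vec(X):
    """Rescaled Branin function evaluated on every row of an (n, 2) array."""
    X = np.asarray(X, dtype=float)
    # Map the unit-square inputs to the usual Branin domain
    x1bar = 15 * X[:, 0] - 5
    x2bar = 15 * X[:, 1]
    term1 = x2bar - 5.1 * x1bar**2 / (4 * np.pi**2) + 5 * x1bar / np.pi - 6
    term2 = (10 - 10 / (8 * np.pi)) * np.cos(x1bar)
    # Center and scale the response
    return (term1**2 + term2 - 44.81) / 51.95

# Hypothetical demo data on the unit square (names are illustrative only)
rng = np.random.default_rng(99)
x_demo = rng.uniform(size=(500, 2))
y_demo = braninsc_vec(x_demo)
print(x_demo.shape, y_demo.shape)
```

This avoids the explicit Python loop; the result is a one-dimensional array with one response per input row.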

Note that x and y are NumPy arrays, which is the intended format. Now we can load the openbt module and fit a BART model. Here we set the model type with model="bart", which fits a homoscedastic BART model. The number of MPI threads to use is specified as tc=4. For a list of all optional parameters, see m.__dict__ (after creating m) or help(OPENBT).

from pyopenbt.openbt import OPENBT, obt_load
m = OPENBT(model = "bart", tc = 4, modelname = "branin")
fit = m.fit(x, y)

Next we can construct predictions and make a simple plot comparing our predictions to the training data. Here, we are calculating the in-sample predictions since we passed the same x array to the predict() function.

# Calculate in-sample predictions
fitp = m.predict(x, tc = 4)

# Make a simple plot
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(16,9)); ax = fig.add_subplot(111)
ax.plot(y, fitp['mmean'], 'ro')
ax.set_xlabel("Observed"); ax.set_ylabel("Fitted")
ax.axline([0, 0], [1, 1])
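
Alongside the plot, a numeric summary of the in-sample fit can be useful. The sketch below assumes the fitted values are available as a one-dimensional array of the same length as the observations (as fitp['mmean'] appears to be in the plot above). The fit_summary helper and the stand-in data are hypothetical, not part of pyopenbt.

```python
import numpy as np

def fit_summary(observed, fitted):
    """Return RMSE and R^2 for two equal-length 1-D arrays."""
    observed = np.asarray(observed, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    resid = observed - fitted
    rmse = np.sqrt(np.mean(resid**2))
    r2 = 1.0 - np.sum(resid**2) / np.sum((observed - observed.mean())**2)
    return {"rmse": rmse, "r2": r2}

# Stand-in data so the sketch runs without a fitted model;
# with a real fit one would pass y and fitp['mmean'] instead.
rng = np.random.default_rng(0)
y_obs = rng.normal(size=200)
y_fit = y_obs + rng.normal(scale=0.1, size=200)
summary = fit_summary(y_obs, y_fit)
print(summary)
```

Because these are in-sample predictions, both numbers describe how well the model reproduces the training data, not how well it generalizes.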

To save the model, use OPENBT's obt_save() function. Similarly, load the model using obt_load(). Because the posterior can be large in sample-based models such as these, the fitted model is saved in a compressed file format with the extension .obt. Additionally, the estimator object can be saved and loaded (see below).

#--------------------------------------------------------------------------------------------
# Save fitted MODEL object (not the estimator object, m) as test.obt in the working directory
m.obt_save(fit, "test", est = False)
# Load fitted model object (AKA fit object) to a new object
fit2 = obt_load("test", est = False)

# We can also save/load the fit ESTIMATOR object by specifying est = True in obt_save()/obt_load().
# The estimator object has all our settings and properties, but not fit results.
# This is similar to scikit-learn saving/loading its estimators.
m.obt_save("test_fit_est", est = True)
m2 = obt_load("test_fit_est", est = True)
#--------------------------------------------------------------------------------------------

The standard variable activity information, calculated as the proportion of splitting rules involving each variable, can be computed using OPENBT's vartivity() function.

# Calculate variable activity information
fitv = m.vartivity()
print(fitv['mvdraws'])
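
To make the definition concrete, here is a small sketch of how a proportion-of-splits summary can be computed from split counts. The counts are invented for illustration, and the layout (one row per posterior draw, one column per variable) is an assumption; it is not taken from how vartivity() stores its results.

```python
import numpy as np

# Hypothetical split counts: rows are posterior draws, columns are variables
split_counts = np.array([
    [12, 8],
    [10, 10],
    [15, 5],
    [9, 11],
])

# Proportion of splitting rules that involve each variable, per draw
props = split_counts / split_counts.sum(axis=1, keepdims=True)

# Average over draws to get one activity value per variable
mean_activity = props.mean(axis=0)
print(mean_activity)
```

Each row of props sums to one, so the averaged activities also sum to one across variables.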

A more accurate alternative is to calculate the Sobol indices.

# Calculate Sobol indices
fits = m.sobol(cmdopt = 'MPI', tc = 4)
print(fits['msi'])
print(fits['mtsi'])
print(fits['msij'])
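
For intuition about what Sobol indices measure, the sketch below estimates first-order and total-effect indices of the rescaled Branin test function directly by Monte Carlo, using the pick-freeze estimators of Saltelli et al. (2010) and Jansen (1999). This is not how OpenBT computes its indices, which come from the fitted model; the function names here are hypothetical.

```python
import numpy as np

def braninsc_rows(X):
    """Rescaled Branin function on every row of an (n, 2) array."""
    x1bar = 15 * X[:, 0] - 5
    x2bar = 15 * X[:, 1]
    term1 = x2bar - 5.1 * x1bar**2 / (4 * np.pi**2) + 5 * x1bar / np.pi - 6
    term2 = (10 - 10 / (8 * np.pi)) * np.cos(x1bar)
    return (term1**2 + term2 - 44.81) / 51.95

def sobol_pick_freeze(f, p, n, rng):
    """Monte Carlo first-order and total-effect Sobol indices on [0, 1]^p."""
    A = rng.uniform(size=(n, p))
    B = rng.uniform(size=(n, p))
    fA, fB = f(A), f(B)
    both = np.concatenate([fA, fB])
    f0, var_y = both.mean(), both.var()
    first = np.empty(p)
    total = np.empty(p)
    for i in range(p):
        AB = A.copy()
        AB[:, i] = B[:, i]          # take column i from B, the rest from A
        fAB = f(AB)
        # Centering fB by f0 leaves the expectation unchanged and reduces noise
        first[i] = np.mean((fB - f0) * (fAB - fA)) / var_y
        total[i] = 0.5 * np.mean((fA - fAB) ** 2) / var_y
    return first, total

first, total = sobol_pick_freeze(braninsc_rows, p=2, n=20000,
                                 rng=np.random.default_rng(99))
print("first-order:", first)
print("total-effect:", total)
```

A total-effect index noticeably larger than the matching first-order index indicates that the variable acts partly through interactions.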

Again, for more examples of using OpenBT, explore the examples folder in the GitHub repo.

See Also:

GitHub "Homepage" for this package
PyPI Package Home

Contributions

All contributions are welcome. You can help improve this project by reporting issues and bugs, or by forking the repo and creating a pull request.


License

The package is licensed under the BSD 3-Clause License. A copy of the license can be found along with the code.
