Probabilistic predictions for tabular data, using diffusion models and decision trees.
Project description
Treeffuser is an easy-to-use package for probabilistic prediction on tabular data with tree-based diffusion models. Its goal is to estimate distributions of the form p(y|x), where x is a feature vector, y is a target vector, and the form of p(y|x) can be arbitrarily complex (e.g., multimodal, heteroskedastic, non-Gaussian, or heavy-tailed).
It is designed to adhere closely to the scikit-learn API and requires minimal user tuning.
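To make "arbitrarily complex" concrete, here is a small NumPy-only sketch of a conditional distribution that is both multimodal and heteroskedastic. The function name and the specific distribution are illustrative assumptions, not part of Treeffuser; the point is only that the conditional mean can be a poor summary of p(y|x).

```python
import numpy as np


def sample_bimodal_heteroskedastic(x, n_samples, rng):
    """Draw y ~ p(y|x) with two modes at +x and -x (hypothetical toy model).

    The noise scale grows with |x|, so the distribution is heteroskedastic.
    """
    signs = rng.choice([-1.0, 1.0], size=n_samples)
    noise = rng.normal(scale=0.05 * (1.0 + abs(x)), size=n_samples)
    return signs * x + noise


rng = np.random.default_rng(0)
y = sample_bimodal_heteroskedastic(x=2.0, n_samples=10_000, rng=rng)

# The conditional mean sits between the two modes, where almost no mass lies.
cond_mean = y.mean()
frac_near_mean = np.mean(np.abs(y - cond_mean) < 0.5)
frac_near_modes = np.mean(np.abs(np.abs(y) - 2.0) < 0.5)
print(f"mean={cond_mean:.3f}, near mean={frac_near_mean:.3f}, near modes={frac_near_modes:.3f}")
```

A point prediction of the mean would land in a region with essentially no probability mass, which is the kind of case where sampling from p(y|x) is more informative.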
Usage Example
Here’s how you can use Treeffuser in your project:
from treeffuser import Treeffuser
import numpy as np
# (n_training, n_features), (n_training, n_targets)
X, y = ... # load your data
# (n_test, n_features)
X_test = ... # load your test data
# Estimate p(y|x) with a tree-based diffusion model
model = Treeffuser()
model.fit(X, y)
# Draw samples y ~ p(y|x) for each test point
# (n_samples, n_test, n_targets)
y_samples = model.sample(X_test, n_samples=1000)
# Compute downstream metrics
mean = np.mean(y_samples, axis=0)
std = np.std(y_samples, axis=0)
median = np.median(y_samples, axis=0)
quantile = np.quantile(y_samples, q=0, axis=0)  # q=0 is the per-point minimum; use any q in [0, 1]
... # other metrics
Please refer to the docstrings for more information on the available methods and parameters.
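As a sketch of how the samples might be turned into prediction intervals, the helper below works on any array with the (n_samples, n_test, n_targets) layout described above. The function `summarize_samples` and its `alpha` parameter are hypothetical and not part of Treeffuser; random normal draws stand in for the output of `model.sample` so the snippet runs without a fitted model.

```python
import numpy as np


def summarize_samples(y_samples, alpha=0.1):
    """Return the mean and a central (1 - alpha) interval per test point.

    y_samples has shape (n_samples, n_test, n_targets); each returned
    array has shape (n_test, n_targets).
    """
    y_samples = np.asarray(y_samples)
    lower, upper = np.quantile(y_samples, [alpha / 2, 1 - alpha / 2], axis=0)
    return {"mean": y_samples.mean(axis=0), "lower": lower, "upper": upper}


# Stand-in for model.sample(X_test, n_samples=1000): standard normal draws.
rng = np.random.default_rng(0)
fake_samples = rng.normal(size=(1000, 5, 2))

summary = summarize_samples(fake_samples, alpha=0.1)
print(summary["lower"].shape, summary["upper"].shape)
```

Reducing over axis 0 keeps one summary per test point and target, which matches the axis used for the mean, standard deviation, and median above.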
Installation
You can install Treeffuser via pip from PyPI with the following command:
pip install treeffuser
You can also install the in-development version with:
pip install git+https://github.com/blei-lab/tree-diffuser.git@main
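To check which version ended up in the current environment, one option is the standard library's `importlib.metadata`. The helper name `installed_version` is an illustrative assumption; it returns None when the distribution is not installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the installed Treeffuser version, or None if it is not installed.
print(installed_version("treeffuser"))
```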
Hashes for treeffuser-0.1.2-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | 288b2700ff7cf3c4cab60708cb151b3662e08f66f2e78be867d9839d263cd02f |
MD5 | 9262cc0d563ef9b1ae5e352b855bf453 |
BLAKE2b-256 | 2307075a2f93b906fc43be5a7164718e23ed8c634861bb4dda0935881bd64f9e |
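A downloaded wheel can be checked against the SHA256 digest listed above. This is a minimal sketch using only `hashlib`; the helper names and the local file path in the usage note are assumptions for illustration.

```python
import hashlib

# SHA256 digest listed above for treeffuser-0.1.2-py3-none-any.whl
EXPECTED_SHA256 = "288b2700ff7cf3c4cab60708cb151b3662e08f66f2e78be867d9839d263cd02f"


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("treeffuser-0.1.2-py3-none-any.whl")` would return True only if a file at that (assumed) local path has the digest shown in the table.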