Bayesian causal inference for zero-inflated outcomes — GPU-accelerated joint hurdle BCF with SBC calibration
pytyche
GPU-accelerated Bayesian causal forests for zero-inflated outcomes — and an adaptive, round-based experiment loop built on top of them.
pytyche does two things. First, it ships some of the fastest Bayesian Causal Forest (BCF) estimators available: continuous and binary effects run on the GPU via bartz, and hurdle outcomes (revenue, spend, and other "mostly-zero, sometimes-positive" metrics) run on pytyche's own GPU kernel. Any of these can be used standalone for a single fit — give it data, get back a calibrated posterior over heterogeneous treatment effects. Second, it wraps those estimators in a round-based adaptive experiment loop that allocates the next round's traffic toward the segments that respond, while keeping controls everywhere so measurement stays honest. The whole loop runs on a single GPU.
The speed is what makes the rest practical: BCF intervals at production scale need empirical recalibration (simulation-based calibration across realistic data), which means hundreds of full posterior fits. On a GPU that is an overnight job instead of a CPU-week, so calibration becomes something you do per-deployment rather than per-publication.
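The calibration idea can be illustrated on a toy problem where the exact posterior is known. The following is a minimal sketch, assuming a conjugate normal-normal model as a stand-in for a full BCF fit; `sbc_ranks`, `central_coverage`, and `sd_scale` are hypothetical names for illustration and are not part of pytyche. Each simulation draws a true parameter from the prior, simulates data, draws from the posterior, and records the rank of the truth among those draws. A well-calibrated posterior gives uniform ranks, so a central 90% interval should cover the truth about 90% of the time; an artificially narrowed posterior should cover noticeably less often.

```python
import numpy as np

def sbc_ranks(n_sims=500, n_obs=20, n_draws=99, sd_scale=1.0, seed=0):
    """Rank of the true parameter among posterior draws, one per simulated dataset.

    Toy model (an assumption for illustration): theta ~ N(0, 1), y_i ~ N(theta, 1).
    sd_scale < 1 deliberately shrinks the posterior sd to mimic overconfidence.
    """
    rng = np.random.default_rng(seed)
    ranks = np.empty(n_sims, dtype=int)
    for s in range(n_sims):
        theta = rng.normal(0.0, 1.0)              # truth drawn from the prior
        y = rng.normal(theta, 1.0, size=n_obs)    # simulated dataset
        post_var = 1.0 / (1.0 + n_obs)            # exact conjugate posterior
        post_mean = post_var * y.sum()
        draws = rng.normal(post_mean, sd_scale * np.sqrt(post_var), size=n_draws)
        ranks[s] = int((draws < theta).sum())     # rank lies in 0..n_draws
    return ranks

def central_coverage(ranks, n_draws=99, level=0.9):
    """Share of simulations whose truth falls inside the central `level` interval."""
    lo = (1.0 - level) / 2.0 * (n_draws + 1)
    hi = (1.0 + level) / 2.0 * (n_draws + 1)
    return float(np.mean((ranks >= lo) & (ranks < hi)))

calibrated = central_coverage(sbc_ranks())
overconfident = central_coverage(sbc_ranks(sd_scale=0.5))
print(f"exact posterior: {calibrated:.2f}  narrowed posterior: {overconfident:.2f}")
```

With a real model each iteration of the loop is a full posterior fit, which is why the number of simulations drives the total cost.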
Install
# Recommended — GPU JAX (CUDA 12, Linux)
uv add 'pytyche[gpu]' # or: pip install 'pytyche[gpu]'
# CPU-only (fully functional; the first fit warns once if no GPU is found)
uv add pytyche # or: pip install pytyche
Check the runtime with python -c "import pytyche as pt; pt.check_setup()".
Quick start
Fit the canonical hurdle model on an 800-visitor synthetic dataset in about 20 seconds on JAX-CPU:
import os; os.environ["JAX_PLATFORMS"] = "cpu"  # omit for GPU
import pytyche as pt

bundle = pt.generate(n_visitors=800, segments={
    "responders": {"pct": 0.4, "base_conv": 0.08, "treatment_effect": 0.10,
                   "aov_mu": 3.5, "aov_sigma": 0.5, "treatment_aov_mu_shift": 0.15},
    "non_responders": {"pct": 0.6, "base_conv": 0.06, "treatment_effect": 0.0,
                       "aov_mu": 3.3, "aov_sigma": 0.5, "treatment_aov_mu_shift": 0.0},
}, metric="revenue_per_visitor", seed=0)

result = pt.fit(bundle.observed, num_burnin=40, num_mcmc=80, num_trees_mu=30,
                num_trees_tau=15, max_depth=4, num_gfr_sweeps=2,
                diagnostic_interval=20, seed=0)
result.analyze()  # treatment comparisons, discovered segments, recommendation
# result.rpv_cate_samples → (n_visitors, 80) posterior draws of the per-visitor effect
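An array of per-visitor posterior draws with the shape noted above, (n_visitors, n_draws), can be reduced to point estimates and credible intervals with plain NumPy. The sketch below is an assumption-laden illustration: it uses a simulated array as a stand-in for `result.rpv_cate_samples` so that it runs without pytyche, and `summarize_cate` is a hypothetical helper, not a library function.

```python
import numpy as np

def summarize_cate(samples, level=0.9):
    """Summarize posterior draws of shape (n_units, n_draws)."""
    samples = np.asarray(samples, dtype=float)
    alpha = (1.0 - level) / 2.0
    # Average over units within each draw, so the ATE keeps its own posterior.
    ate_draws = samples.mean(axis=0)
    return {
        "cate_mean": samples.mean(axis=1),
        "cate_lo": np.quantile(samples, alpha, axis=1),
        "cate_hi": np.quantile(samples, 1.0 - alpha, axis=1),
        "prob_positive": (samples > 0).mean(axis=1),
        "ate_mean": float(ate_draws.mean()),
        "ate_interval": (float(np.quantile(ate_draws, alpha)),
                         float(np.quantile(ate_draws, 1.0 - alpha))),
    }

# Simulated stand-in for result.rpv_cate_samples (800 visitors, 80 draws).
rng = np.random.default_rng(0)
fake_samples = rng.normal(loc=0.5, scale=1.0, size=(800, 80))
summary = summarize_cate(fake_samples)
print(summary["ate_mean"], summary["ate_interval"])
```

Averaging across visitors inside each draw, rather than averaging the per-visitor intervals, is what keeps the uncertainty on the average effect coherent with the joint posterior.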
To run a full multi-round experiment instead of a single fit,
pt.sequential_experiment(...) drives the adaptive loop end to end — a
realistically-sized run (350,000 visitors) takes about fifteen minutes on a
consumer GPU.
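The allocation idea behind such a loop can be sketched independently of the library. The example below is a minimal illustration of Thompson-style (probability-matching) allocation with a fixed control share, assuming posterior draws of each treatment arm's effect are already available for a segment. It is not the algorithm inside `pt.sequential_experiment`; the function name `thompson_allocation` and the `control_floor` value are hypothetical.

```python
import numpy as np

def thompson_allocation(effect_draws, control_floor=0.2):
    """Traffic shares for one segment.

    effect_draws: array (n_draws, n_arms) of posterior draws of each treatment
    arm's effect versus control. Returns shares of length n_arms + 1, where
    index 0 is the control arm.
    """
    if not 0.0 <= control_floor < 1.0:
        raise ValueError("control_floor must be in [0, 1)")
    effect_draws = np.asarray(effect_draws, dtype=float)
    n_draws, n_arms = effect_draws.shape
    best = effect_draws.argmax(axis=1)                      # winning arm per draw
    p_best = np.bincount(best, minlength=n_arms) / n_draws  # P(arm is best)
    shares = np.empty(n_arms + 1)
    shares[0] = control_floor                               # control is never starved
    shares[1:] = (1.0 - control_floor) * p_best             # rest follows P(best)
    return shares

# Illustrative draws: arm 2 has a clearly larger effect than arm 1.
rng = np.random.default_rng(0)
draws = rng.normal(loc=[0.0, 0.3], scale=0.1, size=(2000, 2))
shares = thompson_allocation(draws, control_floor=0.2)
print(shares)
```

Reserving the control share before splitting the remainder is one simple way to keep a comparison group in every segment, whatever the posterior says about the treatments.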
Highlights
- GPU hurdle BCF. Two coupled forests — probit conversion and log-severity — share a single tree topology (following Linero et al.'s shared Bayesian forests), so the structure carries information across both channels and stabilizes per-segment effects at the low conversion rates online experiments actually live at. Roughly 4.5–8.6× faster than the StochTree CPU backend at n=750k; single-channel continuous/binary fits hit 17–63× from n=250k to n=2M (benchmark grid).
- Calibrated intervals. BCF posteriors are narrow by construction; pytyche recalibrates them against simulation-based ground truth so the credible intervals you report are honest at your operating scale.
- Adaptive experiment loop. pt.sequential_experiment runs Thompson allocation with guaranteed control retention and built-in power simulation.
- Interpretable segments. Each round compresses the effect posterior into a shallow policy tree: a reviewable decision surface, not just a model.
- Synthetic data generators. A small typed grammar (pytyche.generators.scenarios) parameterizes the data-generating process for calibration sweeps and power analysis.
- Honest-uncertainty contracts. pytyche.contracts separates observed data from ground truth at the type level, so analysis code cannot accidentally peek at what it shouldn't see.
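The two channels of a hurdle outcome mentioned in the first highlight combine as E[Y] = P(Y > 0) · E[Y | Y > 0]. The sketch below simulates that structure directly, assuming a probit conversion channel and a lognormal severity channel with illustrative parameter values. It does not fit any forests, and `normal_cdf`, `expected_rpv`, and `simulate_hurdle` are hypothetical helpers rather than pytyche functions.

```python
import numpy as np
from math import erf, exp, sqrt

def normal_cdf(z):
    """Standard normal CDF, i.e. the inverse of the probit link."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def expected_rpv(conv_index, log_mu, log_sigma):
    """E[Y] = P(Y > 0) * E[Y | Y > 0] for a probit / lognormal hurdle."""
    return normal_cdf(conv_index) * exp(log_mu + 0.5 * log_sigma ** 2)

def simulate_hurdle(n, conv_index, log_mu, log_sigma, rng):
    """Mostly-zero outcomes: zero unless converted, lognormal spend otherwise."""
    converted = rng.random(n) < normal_cdf(conv_index)
    severity = rng.lognormal(mean=log_mu, sigma=log_sigma, size=n)
    return np.where(converted, severity, 0.0)

rng = np.random.default_rng(0)
# Treatment shifts both channels: conversion index and log-severity mean.
control = simulate_hurdle(200_000, conv_index=-1.5, log_mu=3.30, log_sigma=0.5, rng=rng)
treated = simulate_hurdle(200_000, conv_index=-1.3, log_mu=3.45, log_sigma=0.5, rng=rng)

true_effect = expected_rpv(-1.3, 3.45, 0.5) - expected_rpv(-1.5, 3.30, 0.5)
observed_effect = treated.mean() - control.mean()
print(f"zeros in control: {(control == 0).mean():.3f}")
print(f"true effect: {true_effect:.3f}  observed: {observed_effect:.3f}")
```

Because the revenue-per-visitor effect is a product of the two channels, a small lift in either one moves the total, which is why modelling them jointly matters at low conversion rates.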
Documentation
- Your first hurdle BCF fit — install to an interpretable posterior.
- The adaptive experiment — the full multi-round loop, end to end.
- Overview — what pytyche does, who it's for, and the design lineage.
- Full documentation — tutorials, how-to guides, concepts, and the API reference.
When to use it
pytyche is built for designed experiments: round-based online tests with a handful of treatments where assignment rules are explicit and propensities are recorded exactly. It also supports observational causal inference — BCF is purpose-built for confounded settings, taking propensity scores into the prior for strong point estimation. Two honest caveats there: pytyche expects propensity scores as an input (it has no built-in nuisance/propensity estimation or double-ML cross-fitting — that's the reason to reach for econml or DoubleML instead), and the library is shaped and validated around designed experiments, so observational use is supported but less tested. In all cases, treat intervals as needing calibration at your scale before you rely on them.
Out of scope: marketplaces and anything with cross-visitor interference (SUTVA violations), regulated contexts needing preregistration-grade governance, large-catalog per-item recommendation, and real-time / streaming adaptation. The full scope discussion is in the overview.
Contributing
Contributions are welcome — see CONTRIBUTING.md for the
development setup, branching model, and testing tiers.
License
MIT; see LICENSE. Built on bartz (MIT) by Giacomo Petrillo: the GPU BART kernels
that the continuous and binary paths run on are bartz's. The hurdle GPU kernel,
shared-tree extensions, and the calibration / targeting / generator stack are
pytyche's.
Source: https://gitlab.com/tradcliffe2/tyche · PyPI: https://pypi.org/project/pytyche/ (the package is pytyche; the GitLab repo is tyche for URL brevity).
Citation
Methodology paper in preparation. Cite as pytyche, v0.2.1, https://gitlab.com/tradcliffe2/tyche until a citable DOI is up.