A practitioner's toolbox for estimating large-scale Gaussian Process models with PyMC and PyTensor
PTGP
A Gaussian process library for building GP models that solve real-world problems.
Who this is for
PTGP is for practitioners who need flexible, well-supported GP modeling. The goal of PTGP is to be fully batteries-included and ready to work on real-world problems:
- Practical GP algorithms: exact GP, VFE with collapsed bound, SVGP with minibatch training, VFF (Variational Fourier Features)
- Full kernel library: ExpQuad, Matern52/32/12, RandomWalk, Gibbs, WarpedInput, categorical kernels for multi-class or categorical input variables, composition via `+` and `*`, `active_dims` for dimension selection
- Non-Gaussian likelihoods: Bernoulli, Poisson, NegativeBinomial, StudentT
- PyMC priors: set priors on any hyperparameter; use PyMC distributions for mean functions and noise models; MAP training by default
- Training tools: L-BFGS-B and Adam optimizers, per-parameter learning rates, staged optimization, frozen variables, inducing point initialization strategies, diagnostic-guided workflows; more are being added, such as carefully monitored training to help diagnose issues early
- Agent-readable docs: `docs/agents/` ships LLM-readable guides for debugging training issues and folk wisdom (VFE training covered). See the Working with AI coding assistants section below.
- More coming: see the issues
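As a rough illustration of what kernel composition and dimension selection mean, here is a minimal NumPy sketch. It is not PTGP's API: the function names `expquad`, `matern52`, and `scaled_sqdist`, and the way `active_dims` is handled, are assumptions made for this example only. It shows that adding or multiplying kernel matrices elementwise yields another valid (symmetric, positive semi-definite) kernel matrix.

```python
import numpy as np


def scaled_sqdist(X, Z, ls):
    """Squared Euclidean distance between rows of X and Z after dividing by ls."""
    diff = X[:, None, :] / ls - Z[None, :, :] / ls
    return (diff**2).sum(axis=-1)


def expquad(X, Z, ls=1.0, active_dims=None):
    """Exponentiated quadratic kernel on the selected input columns (hypothetical helper)."""
    if active_dims is not None:
        X, Z = X[:, active_dims], Z[:, active_dims]
    return np.exp(-0.5 * scaled_sqdist(X, Z, ls))


def matern52(X, Z, ls=1.0, active_dims=None):
    """Matern 5/2 kernel on the selected input columns (hypothetical helper)."""
    if active_dims is not None:
        X, Z = X[:, active_dims], Z[:, active_dims]
    s = np.sqrt(5.0 * scaled_sqdist(X, Z, ls))
    return (1.0 + s + s**2 / 3.0) * np.exp(-s)


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))

# Each kernel only sees one input column; composition acts on the kernel matrices.
K_sum = expquad(X, X, ls=0.5, active_dims=[0]) + matern52(X, X, ls=2.0, active_dims=[1])
K_prod = expquad(X, X, ls=0.5, active_dims=[0]) * matern52(X, X, ls=2.0, active_dims=[1])
```

A sum of kernels models additive effects across the selected dimensions, while a product models an interaction between them.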
Researchers benefit from the underlying design: PTGP is built on PyTensor's symbolic graph and rewrite system, so you write GP math directly (`pt.linalg.inv(K)`, `pt.linalg.slogdet(K)`) and the compiler chooses efficient algorithms based on declared matrix structure. This makes it straightforward to implement new GP approximations and create custom models. Eventually it should also let PTGP exploit Kronecker, Toeplitz, and sparse matrix structure automatically.
Models
| Model | Scale | Best for |
|---|---|---|
| `gp.Unapproximated` | N < ~2,000 | Exact inference, model comparison |
| `gp.VFE` | N < ~50,000 | Medium-scale data with inducing points |
| `gp.SVGP` | N up to ~500,000 | Large data, non-Gaussian likelihoods, minibatch training |
| `FourierFeatures1D` | 1D Matern kernels | Structured Kuu via Fourier basis; no inducing point placement |
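To make the difference between the exact and inducing-point rows concrete, the following NumPy sketch compares the exact log marginal likelihood with the collapsed (Titsias-style) VFE bound on a toy problem. It is not PTGP code: all function names are hypothetical, the kernel is a unit-variance exponentiated quadratic, and the bound is evaluated densely for clarity rather than in the cheaper O(NM²) form a real implementation would use.

```python
import numpy as np


def expquad(X, Z, ls=1.0):
    """Unit-variance exponentiated quadratic kernel matrix."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / ls**2)


def gauss_logpdf(y, K):
    """log N(y | 0, K), computed from a Cholesky factor of K."""
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L, y)
    half_logdet = np.log(np.diag(L)).sum()
    return -0.5 * alpha @ alpha - half_logdet - 0.5 * len(y) * np.log(2 * np.pi)


def exact_mll(X, y, sigma, ls=1.0):
    """Exact GP log marginal likelihood: log N(y | 0, Knn + sigma^2 I)."""
    return gauss_logpdf(y, expquad(X, X, ls) + sigma**2 * np.eye(len(y)))


def collapsed_bound(X, y, Z, sigma, ls=1.0, jitter=1e-6):
    """Collapsed bound: log N(y | 0, Qnn + sigma^2 I) - tr(Knn - Qnn) / (2 sigma^2)."""
    Kmm = expquad(Z, Z, ls) + jitter * np.eye(len(Z))
    Kmn = expquad(Z, X, ls)
    A = np.linalg.solve(np.linalg.cholesky(Kmm), Kmn)  # Qnn = A.T @ A
    Qnn = A.T @ A
    knn_diag = np.ones(len(X))  # diagonal of Knn for a unit-variance kernel
    trace_term = (knn_diag - (A**2).sum(axis=0)).sum() / (2 * sigma**2)
    return gauss_logpdf(y, Qnn + sigma**2 * np.eye(len(y))) - trace_term


rng = np.random.default_rng(1)
X = rng.normal(size=(60, 1))
y = np.sin(X.ravel()) + 0.1 * rng.normal(size=60)
Z = np.linspace(-2, 2, 10)[:, None]  # 10 inducing points on a grid
sigma = 0.1

exact = exact_mll(X, y, sigma)
bound = collapsed_bound(X, y, Z, sigma)
gap = exact - bound  # non-negative; shrinks as the inducing points cover the data better
```

The gap between the two values is one way to judge whether a given number and placement of inducing points is adequate.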
Quick start
import numpy as np
import pymc as pm
import pytensor.tensor as pt
import ptgp as pg

# Toy 1D regression data: a noisy sine wave
X = np.random.randn(200, 1)
y = np.sin(X.ravel()) + 0.1 * np.random.randn(200)

# 20 inducing points, initialized on a regular grid
Z_init = np.linspace(-2, 2, 20)[:, None]
Z_var = pt.matrix("Z", shape=(20, 1))

with pm.Model() as model:
    # PyMC priors on the kernel hyperparameters
    ls = pm.InverseGamma("ls", alpha=2.0, beta=1.0)
    eta = pm.Exponential("eta", lam=1.0)
    kernel = eta**2 * pg.kernels.Matern52(input_dim=1, ls=ls)

    svgp = pg.gp.SVGP(
        kernel=kernel,
        likelihood=pg.likelihoods.Gaussian(sigma=0.1),
        inducing_variable=pg.inducing.Points(Z_var, Z_init=Z_init),
        variational_params=pg.gp.init_variational_params(M=20),
    )

    # Train, then predict on a test grid
    fit = pg.fit(svgp, X, y, method="L-BFGS-B")
    mean, var = pg.predict(svgp, np.linspace(-3, 3, 100)[:, None], fit)
`pg.fit` picks a default objective from the GP type (`Unapproximated` → `marginal_log_likelihood`, `VFE` → `collapsed_elbo`, `SVGP` → `elbo`) and returns a `FitResult` that `pg.predict` consumes. For stochastic mini-batch training, staged VFE, or per-group learning rates, drop down to `pg.optim.compile_training_step` / `pg.optim.compile_scipy_objective` (see `notebooks/demo.ipynb`):
X_var = pt.matrix("X")
y_var = pt.vector("y")

# Compile a single training step for the ELBO objective
step, shared_params, shared_extras = pg.optim.compile_training_step(
    pg.objectives.elbo, svgp, X_var, y_var, model, learning_rate=1e-2
)

# Run 500 training steps on the full data set
for i in range(500):
    loss = step(X, y)

# Compile a prediction function that reuses the trained shared parameters
predict_fn = pg.optim.compile_predict(
    svgp, pt.matrix("X_new"), model, shared_params, shared_extras=shared_extras
)
mean, var = predict_fn(np.linspace(-3, 3, 100)[:, None])
Training uses MAP by default: the PyMC log-prior is added to the objective. Pass `include_prior=False` for pure ELBO. For exact GPs and VFE, use `compile_scipy_objective` with L-BFGS-B instead. See `notebooks/demo.ipynb` for end-to-end examples covering all three models.
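Written out, the MAP objective described above looks roughly as follows. The symbols are assumptions for this sketch and are not taken from PTGP's documentation: $\theta$ collects the hyperparameters that carry PyMC priors, $\mathcal{L}(\theta)$ is the chosen training objective (marginal log likelihood, collapsed ELBO, or ELBO), and $p(\theta)$ is the joint prior density.

```latex
\hat{\theta}_{\mathrm{MAP}}
  = \arg\max_{\theta} \bigl[\, \mathcal{L}(\theta) + \log p(\theta) \,\bigr]
```

With `include_prior=False` the $\log p(\theta)$ term is dropped and only $\mathcal{L}(\theta)$ is maximized. When $\mathcal{L}$ is a lower bound rather than the exact marginal likelihood, the result is an approximate MAP estimate.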
How it works
PTGP is built on PyTensor's symbolic graph. Kernels, likelihoods, and GP models return symbolic tensors written with naive linear algebra such as `pt.linalg.inv(K)`, which PyTensor's rewrite system automatically lowers to efficient Cholesky-based code using declared matrix properties. All models compile their full forward+gradient step down to the minimum number of cubic factorizations.
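The equivalence that such a rewrite relies on can be checked by hand. The sketch below uses plain NumPy as a stand-in (it does not call PyTensor or PTGP, and the function names are made up for this example): it evaluates a Gaussian log density once with an explicit inverse and log-determinant, and once from a single Cholesky factor.

```python
import numpy as np


def logpdf_naive(K, y):
    """log N(y | 0, K) written the 'textbook' way: explicit inverse and slogdet."""
    _, logdet = np.linalg.slogdet(K)
    quad = y @ np.linalg.inv(K) @ y
    return -0.5 * (quad + logdet + len(y) * np.log(2 * np.pi))


def logpdf_cholesky(K, y):
    """Same quantity from one Cholesky factor K = L L^T."""
    L = np.linalg.cholesky(K)
    z = np.linalg.solve(L, y)  # z = L^{-1} y, so z @ z = y^T K^{-1} y
    logdet = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * (z @ z + logdet + len(y) * np.log(2 * np.pi))


rng = np.random.default_rng(2)
x = rng.normal(size=(30, 1))
# Exponentiated quadratic kernel plus observation noise on the diagonal
K = np.exp(-0.5 * (x - x.T) ** 2) + 0.1 * np.eye(30)
y = rng.normal(size=30)

naive = logpdf_naive(K, y)
chol = logpdf_cholesky(K, y)
```

Both functions return the same value up to floating-point error; the Cholesky version needs only one cubic-cost factorization and is numerically more stable for ill-conditioned `K`.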
PTGP tries to distill some of the approaches of existing GP libraries, mainly GPJax, GPflow, and GPyTorch, and make them more accessible.
Working with AI coding assistants
PTGP is set up to work nicely with AI coding assistants:
- `AGENTS.md`: project-level instructions for AI coding assistants (architecture, conventions, where things live, how to run tests). Follows the AGENTS.md cross-tool convention used by Codex, Cursor, Aider, and others.
- `docs/agents/`: backend-agnostic agent-skill docs covering folk wisdom and training-debug recipes. Currently includes `ptgp-vfe` (VFE diagnostic skill: pitfalls, escalation workflow, interpretation of `VFEDiagnostics` and `GreedyVarianceDiagnostics`).
Claude Code users
Claude Code reads `CLAUDE.md`, not `AGENTS.md`. Symlink them so they stay in sync:
ln -s AGENTS.md CLAUDE.md
To install the VFE skill into a Claude Code skill directory (so Claude auto-discovers it when you mention VFE), run:
python scripts/install_claude_skills.py --project . # ./.claude/skills/
python scripts/install_claude_skills.py --user # ~/.claude/skills/
Install
pip install git+https://github.com/pymc-devs/ptgp.git
To hack on PTGP itself, clone and install in editable mode:
git clone https://github.com/pymc-devs/ptgp.git
cd ptgp
pip install -e .
Contributing
See the issues for what's being worked on. Feel free to propose issues, feature requests, or use cases you've been hoping could be made easier. PRs are always welcome.
File details
Details for the file ptgp-0.1.0.tar.gz.
File metadata
- Download URL: ptgp-0.1.0.tar.gz
- Upload date:
- Size: 140.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6e813bf1e5d6781e50f8d328981279591438b772f5c20d4b1ee55b038356e504 |
| MD5 | c5def6e8862fc61add4547b4c87ab152 |
| BLAKE2b-256 | 03ad04c3891fbde91b6f6fa8f299b88269a487e6c20eb11bf3d8770594c30ffb |
File details
Details for the file ptgp-0.1.0-py3-none-any.whl.
File metadata
- Download URL: ptgp-0.1.0-py3-none-any.whl
- Upload date:
- Size: 71.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b107eda722d1f0d05f369eba87c090fadea9bd057894d8fd3424f362bec90e13 |
| MD5 | 4ea6f44cc9efbca58054b1dadb6f2a46 |
| BLAKE2b-256 | 8e135c4f4327f851ccb750c169aefdb1c01e7da06d11e3eab8732a3bdf92b730 |