Simulation based calibration and generation of synthetic data.

Simulation Based Calibration

A PyMC and Bambi implementation of the algorithms from:

Sean Talts, Michael Betancourt, Daniel Simpson, Aki Vehtari, Andrew Gelman: “Validating Bayesian Inference Algorithms with Simulation-Based Calibration”, 2018; arXiv:1804.06788

Many thanks to the authors for providing open, reproducible code and implementations in rstan and PyStan (link).

Installation

Install from PyPI with pip:

pip install simuk

Quickstart

  1. Define a PyMC or Bambi model. For example, the centered eight schools model:

    import numpy as np
    import pymc as pm
    
    data = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0])
    sigma = np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0])
    
    with pm.Model() as centered_eight:
        mu = pm.Normal('mu', mu=0, sigma=5)
        tau = pm.HalfCauchy('tau', beta=5)
        theta = pm.Normal('theta', mu=mu, sigma=tau, shape=8)
        y_obs = pm.Normal('y', mu=theta, sigma=sigma, observed=data)
    
  2. Pass the model to the SBC class and run the simulations. This takes a while, because the model is sampled once per simulation.

    from simuk import SBC

    sbc = SBC(centered_eight,
            num_simulations=100, # ideally this should be higher, like 1000
            sample_kwargs={'draws': 25, 'tune': 50})
    
    sbc.run_simulations()
    
    79%|███████▉  | 79/100 [05:36<01:29,  4.27s/it]
    
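  3. Plot the empirical CDF of the rank statistics (where each prior draw falls among its posterior draws), shown as a difference from the uniform CDF. If inference is well calibrated, the lines should stay close to uniform and within the oval envelope.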

    sbc.plot_results()
    

Simulation based calibration plots, ecdf
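
To make the quantity in that plot concrete, here is a minimal NumPy sketch of an ECDF difference for rank statistics. It is not the plotting code used by this library: the function name `ecdf_difference`, the jittering of integer ranks, and the synthetic `calibrated` and `biased` ranks are all assumptions made for illustration, and the envelope is not computed.

```python
import numpy as np

def ecdf_difference(ranks, num_draws, num_points=101, seed=0):
    """ECDF of normalized ranks minus the Uniform(0, 1) CDF (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    ranks = np.asarray(ranks)
    # Jitter the integer ranks so that uniform ranks on 0..num_draws
    # map exactly to Uniform(0, 1) values.
    u = (ranks + rng.uniform(size=ranks.shape)) / (num_draws + 1)
    grid = np.linspace(0.0, 1.0, num_points)
    # Fraction of normalized ranks at or below each grid point.
    ecdf = (u[:, None] <= grid[None, :]).mean(axis=0)
    return grid, ecdf - grid

# Synthetic ranks, for illustration only (25 posterior draws per simulation).
rng = np.random.default_rng(1)
calibrated = rng.integers(0, 26, size=5000)   # uniform ranks on 0..25
biased = rng.binomial(25, 0.7, size=5000)     # ranks piled up on the high side

_, diff_ok = ecdf_difference(calibrated, num_draws=25)
_, diff_bad = ecdf_difference(biased, num_draws=25)
```

For the uniform ranks the difference stays near zero across the grid, while the skewed ranks produce a large excursion, which is the kind of pattern that would leave the envelope in the plot.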

What is going on here?

The paper on the arXiv is very well written, and explains the algorithm quite well.

Morally, the example below is exactly what this library does, but it generalizes to more complicated models:

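def my_model(y=None):
    # y=None leaves 'y' unobserved, so it can be drawn from the prior
    with pm.Model() as model:
        x = pm.Normal('x')
        pm.Normal('y', mu=x, observed=y)
    return model

Given this model, the library computes, in essence:

num_trials = 100

with my_model():
    prior = pm.sample_prior_predictive(num_trials).prior

simulations = {'x': []}
for idx in range(num_trials):
    # one joint prior draw: a parameter value and the data it generates
    x_tilde = prior['x'].values[0, idx]
    y_tilde = prior['y'].values[0, idx]
    with my_model(y=y_tilde):
        idata = pm.sample(progressbar=False)
    # rank of the prior draw among the posterior draws
    simulations['x'].append(int((idata.posterior['x'] < x_tilde).sum()))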
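
The loop above can be tried without running MCMC by using a model whose posterior is known in closed form. The sketch below is an illustration under stated assumptions, not part of this library: it takes x ~ N(0, 1) and y | x ~ N(x, 1), for which the posterior is x | y ~ N(y / 2, 1 / 2), and draws exact posterior samples with NumPy. The function name `sbc_ranks` and the choice of 63 posterior draws are hypothetical.

```python
import numpy as np

def sbc_ranks(num_trials=2000, num_draws=63, seed=0):
    """Rank statistics for x ~ N(0, 1), y | x ~ N(x, 1), using exact posterior draws."""
    rng = np.random.default_rng(seed)
    x_tilde = rng.normal(0.0, 1.0, size=num_trials)   # draws from the prior
    y_tilde = rng.normal(x_tilde, 1.0)                # data simulated from each draw
    # Conjugate posterior for one observation: x | y ~ N(y / 2, 1 / 2).
    post = rng.normal(y_tilde[:, None] / 2.0, np.sqrt(0.5),
                      size=(num_trials, num_draws))
    # Rank of each prior draw among its own posterior draws (0..num_draws).
    return (post < x_tilde[:, None]).sum(axis=1)

ranks = sbc_ranks()
counts = np.bincount(ranks, minlength=64)   # one bin per possible rank
```

Because the posterior draws here are exact, the ranks should be roughly uniform on 0..63, so `counts` should be roughly flat. Replacing the posterior with a wrong one (for example, a variance of 1 instead of 1/2) is a quick way to see how miscalibration distorts the ranks.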
