
Smart cache for Stan models and runs


Quicker model iterations and higher productivity for Stan MCMC, achieved by

  • caching compiled models, keyed by the normalised model code

  • caching sampling results, keyed by the model code and the data

No waiting to resample the same model with the same data.

Install

First install CmdStanPy and CmdStan and make sure they work. Then install cmdstancache:

$ pip install cmdstancache

Usage

model = """
data {
  int N;
}
parameters {
  array[N] real<lower=-10.0, upper=10.0> x;
}
model {
  for (i in 1:(N-1)) {
    target += -2 * (100 * square(x[i+1] - square(x[i])) + square(1 - x[i]));
  }
}
"""
data = dict(N=2)

import cmdstancache

stan_variables, method_variables = cmdstancache.run_stan(
    model,
    data=data,
    # any other sample() parameters go here
    seed=42,
)

Now comes the trick:

  • If you run this code twice, the second run reads the stored result instead of sampling again.

  • If you only add or modify a code comment, the same result is returned without recompiling or resampling.
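The comment behaviour follows if the cache key is computed from the code after comments and indentation have been stripped. The snippet below is a minimal sketch of that idea, not cmdstancache's actual implementation: the helper names normalise_stan_code and code_hash, the regular expressions, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import re


def normalise_stan_code(code: str) -> str:
    """Strip comments, indentation and blank lines (hypothetical helper)."""
    # Remove /* ... */ block comments, which may span several lines.
    code = re.sub(r"/\*.*?\*/", "", code, flags=re.DOTALL)
    # Remove // line comments. Assumes "//" never appears inside a string literal.
    code = re.sub(r"//[^\n]*", "", code)
    # Drop leading/trailing whitespace on each line, then drop empty lines.
    lines = (line.strip() for line in code.splitlines())
    return "\n".join(line for line in lines if line)


def code_hash(code: str) -> str:
    """Hash the normalised code; SHA-256 is an assumed choice."""
    return hashlib.sha256(normalise_stan_code(code).encode("utf-8")).hexdigest()


original = """
parameters {
  real x;
}
model {
  x ~ normal(0, 1);
}
"""

# Same model, but with comments added and different indentation.
commented = """
parameters {
  real x;  // the only parameter
}
model {
    /* standard normal prior */
    x ~ normal(0, 1);
}
"""

# A genuinely different model.
changed = original.replace("normal(0, 1)", "normal(0, 2)")

print(code_hash(original) == code_hash(commented))  # same key: cache hit
print(code_hash(original) == code_hash(changed))    # different key: cache miss
```

With a key of this kind, only edits that change the statements themselves lead to a new compilation and a new sampling run.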


How it works

cmdstancache keeps a cache of the code and data that have previously been used for MCMC sampling. If it already has results for a given combination of code and data, it returns them from the cache.

Here are the details:

  1. The code is normalised (stripped of comments and indents)

  2. A hash of the normalised code is computed

  3. The model code is stored in ~/.stan_cache/<codehash>.stan

  4. The model is compiled, if it is not already there

  5. The data are sorted by key, exported to JSON, and hashed

  6. The data are stored in ~/.stan_cache/<datahash>.json

  7. cmdstanpy MCMC is run with code=<codehash>.stan and data=<datahash>.json

  8. fit.stan_variables() and fit.method_variables() are returned

  9. joblib memoizes steps 7 and 8, so nothing is resampled when the same code hash and data hash are seen again.
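Steps 5 and 9 can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's code: data_hash, cached_run and expensive_sampling are hypothetical names, SHA-256 is an assumed hash, and a plain in-memory dict stands in for joblib's on-disk memoization. The data are assumed to be plain Python values that json can serialise.

```python
import hashlib
import json


def data_hash(data: dict) -> str:
    """Serialise with sorted keys so that key order does not affect the hash."""
    serialised = json.dumps(data, sort_keys=True)
    return hashlib.sha256(serialised.encode("utf-8")).hexdigest()


_results = {}  # in-memory stand-in for joblib's on-disk cache
calls = []     # records every time the "sampler" actually runs


def expensive_sampling(code_key: str, data_key: str) -> dict:
    """Placeholder for the real MCMC run (steps 7 and 8)."""
    calls.append((code_key, data_key))
    return {"code": code_key, "data": data_key}


def cached_run(code_key: str, data: dict) -> dict:
    """Run the sampler only if this (code hash, data hash) pair is new."""
    key = (code_key, data_hash(data))
    if key not in _results:
        _results[key] = expensive_sampling(*key)
    return _results[key]


first = cached_run("abc123", dict(N=2, M=3))
second = cached_run("abc123", dict(M=3, N=2))  # same data, different key order
third = cached_run("abc123", dict(N=3, M=3))   # different data

print(first is second)  # the stored result is reused
print(len(calls))       # the sampler ran once per distinct (code, data) pair
```

Because the lookup key consists only of the two hashes, changing either the normalised code or the data triggers a fresh run, while reordering dictionary keys does not.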

Plotting

Make a quick corner plot of only the scalar model variables:

cmdstancache.plot_corner(stan_variables)

In case some chains are stuck, and you want to remove their samples for plotting:

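# assumes remove_stuck_chains is exposed at the package top level, like plot_corner
cleaned_variables = cmdstancache.remove_stuck_chains(stan_variables, method_variables)
plot = cmdstancache.plot_corner(cleaned_variables)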

Plotting is optional, so the corner dependency is pulled in only when installing with:

$ pip install cmdstancache[plot]

Contributors

  • @JohannesBuchner

Contributions are welcome.
