
Hierarchical Bayesian ODE Models with Stan


Python package to automatically generate Stan code for fitting hierarchical ODE models with Stan. The integration of the ODEs is distributed over multiple CPU cores using Stan's map_rect function.

HBOMS is tailored to panels of time series from replicate experiments or observations where there is latent inter-unit variability in parameters. In the frequentist universe one would use a non-linear random effects (or mixed effects) model. Writing such models in Stan is not that difficult, but it is tedious and error prone, especially during model development and when one wants to test multiple model variants and assumptions about the parameters. HBOMS generates efficient, human-readable, and correct Stan code from minimal user input. As models are constructed in Python, Stan code can be generated programmatically, making it easier to compare a number of candidate models. HBOMS provides methods to fit the Stan models to data and inspect the results, but the generated code can also be developed further, independently of the HBOMS package.

Similar or related packages are brms, Monolix, Torsten, and Pumas.

Installation

Currently, HBOMS cannot be installed from PyPI, but it can be installed with pip from a local copy of the source. First, download the source code from the GitHub repository, for instance with

$ git clone git@github.com:chvandorp/hboms.git

Then create a virtual environment with venv and activate it:

$ python3 -m venv myvenv
$ source myvenv/bin/activate

Here "myvenv" can be any name you want. Then install hboms with pip:

(myvenv) $ pip install path/to/hboms

We can now import hboms in a Python session in the virtual environment myvenv.

Simple example

Consider the Lotka-Volterra model with an equation for prey, $dx/dt = ax - bxy$, and for predators, $dy/dt = cxy - dy$. We would write this in Stan as

ddt_x = a*x - b*x*y;
ddt_y = c*x*y - d*y;

Of course, we would also have to declare the state variables x and y, their time derivatives ddt_x and ddt_y, and the parameters a, b, c and d. However, HBOMS manages all these declarations for you. In Python, we first import the hboms module and define a string containing the ODE model:

import hboms

ode_model = """
ddt_x = a*x - b*x*y;
ddt_y = c*x*y - d*y;
"""
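As a quick sanity check of the dynamics themselves, the same ODEs can be integrated directly in Python, independently of Stan and HBOMS. The sketch below uses a plain fourth-order Runge-Kutta scheme and the parameter values that appear as initial guesses later in this README; it is purely illustrative and not part of the HBOMS workflow.

```python
import math

# Right-hand side of the Lotka-Volterra ODEs from the snippet above
def deriv(x, y, a, b, c, d):
    return a * x - b * x * y, c * x * y - d * y

# One classical fourth-order Runge-Kutta step of size h
def rk4_step(x, y, h, pars):
    k1x, k1y = deriv(x, y, *pars)
    k2x, k2y = deriv(x + 0.5 * h * k1x, y + 0.5 * h * k1y, *pars)
    k3x, k3y = deriv(x + 0.5 * h * k2x, y + 0.5 * h * k2y, *pars)
    k4x, k4y = deriv(x + h * k3x, y + h * k3y, *pars)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)

pars = (0.8, 0.7, 0.6, 0.4)  # a, b, c, d: the initial guesses used below
x, y = 1.0, 1.0              # initial condition x(0) = y(0) = 1
traj = [(x, y)]
for _ in range(2000):        # integrate to t = 20 with step size 0.01
    x, y = rk4_step(x, y, 0.01, pars)
    traj.append((x, y))

# Lotka-Volterra conserves H = c*x - d*log(x) + b*y - a*log(y),
# which gives a quick correctness check of the integration
H = [pars[2] * u - pars[3] * math.log(u) + pars[1] * v - pars[0] * math.log(v)
     for u, v in traj]
```

The conserved quantity H should stay (numerically) constant along the trajectory, which makes it a convenient check that the equations were typed correctly.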

Additionally, we have to specify the initial condition, i.e., express $x(0)$ and $y(0)$ in terms of the parameters. For state variables x and y, the initial values are named x_0 and y_0.

init = """
x_0 = x0;
y_0 = y0;
"""

where x0 and y0 are two more parameters that we have to estimate.

Speaking of parameters, we next have to specify the types of all parameters in the model:

params = [
    hboms.Parameter("a", 0.8, "random"), # name, initial guess, type
    hboms.Parameter("b", 0.7, "random"),
    hboms.Parameter("c", 0.6, "random"),
    hboms.Parameter("d", 0.4, "random"),
    hboms.Parameter("x0", 1.0, "const"), # initial conditions
    hboms.Parameter("y0", 1.0, "const"),
    hboms.Parameter("sigma", 0.1, "fixed") # measurement error
]

The parameters $a, b, c$ and $d$ have random effects, i.e., they are unobserved random variables that vary between different replicates of the experiment. The parameter $\sigma$ determines the magnitude of the measurement error and is the same for all experiments (this assumption can be relaxed). By default, parameters are restricted to be positive.
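To build intuition for the "random" type: each experimental unit gets its own parameter value, drawn from a population distribution. Since parameters are positive by default, a log-normal population distribution is a natural mental picture. The snippet below only illustrates this concept; it is not necessarily the exact parametrization HBOMS uses, and the value of omega is made up.

```python
import math
import random

random.seed(42)

pop_median_a = 0.8  # population-level value for parameter a (the initial guess above)
omega = 0.25        # between-unit standard deviation on the log scale (made-up value)

# one draw of the parameter a for each of three experimental units
unit_a = [random.lognormvariate(math.log(pop_median_a), omega) for _ in range(3)]
# every draw is positive and scattered around 0.8
```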

In addition to the parameters, we have to specify the state variables

state = [hboms.Variable("x"), hboms.Variable("y")]

and observations

obs = [hboms.Observation("X"), hboms.Observation("Y")]

We then have to specify distributions (or likelihood functions) to couple the observations to the model predictions. Here we assume that X and Y have log-normal distributions with location parameters log(x) and log(y) and scale parameter sigma.

dists = [
    hboms.StanDist("lognormal", "X", params=["log(x)", "sigma"]),
    hboms.StanDist("lognormal", "Y", params=["log(y)", "sigma"])
]

The parameters for these distributions can be arbitrary Stan expressions. The first argument "lognormal" is the name of the log-normal distribution defined natively in Stan.
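A useful consequence of choosing lognormal(log(x), sigma) is that the median of the observation distribution equals the model prediction x. A quick check of this property using only the Python standard library (the values of x and the number of draws are arbitrary):

```python
import math
import random
import statistics

random.seed(1)

x = 2.5       # hypothetical model prediction
sigma = 0.1   # measurement error scale, as in the parameter list above

# simulate many measurements X ~ lognormal(log(x), sigma)
draws = [random.lognormvariate(math.log(x), sigma) for _ in range(100_000)]

# the sample median is (approximately) the model prediction x
med = statistics.median(draws)
```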

We then combine all components into a HbomsModel object, which "transpiles" the model into Stan code; Stan in turn translates this code into C++ and compiles it into an executable.

hbm = hboms.HbomsModel(
    name = "lotka",
    odes = ode_model,
    init = init,
    params = params,
    state = state,
    obs = obs,
    dists = dists
)

To see what Stan code HBOMS generated, a utility is provided that works best in Jupyter notebooks:

hboms.utilities.show_stan_model(hbm.model_code)

This displays the complete Stan model with appropriate syntax highlighting.

An HBOMS model expects a data set given as a Python dictionary with keys "Time" for time points per unit, and keys for all the observations. In this case we have to provide keys "X" and "Y". As an example, we could have

example_data = {
    "Time" : [
        [1, 2, 3, 4, 5],   # unit 1 has 5 observations
        [1, 2.5, 3, 4],    # unit 2 has 4 observations
        [1, 3, 4, 5, 6, 7] # unit 3 has 6 observations
    ],
    "X" : [
        [0.4, 0.3, 0.1, 0.2, 0.5], # X-data for unit 1
        [0.4, 0.2, 0.05, 0.3],
        [0.35, 0.12, 0.25, 0.55, 0.3, 0.1]
    ],
    "Y" : [
        [0.1, 0.45, 0.33, 0.09, 0.24], # Y-data for unit 1
        [0.03, 0.41, 0.22, 0.07],
        [0.12, 0.38, 0.1, 0.23, 0.6, 0.5]
    ]
}

This is just to show the structure of the expected dataset. HBOMS can also be used to generate pseudo-data for quick testing and model validation using the method hbm.simulate.
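Stan itself does not accept ragged arrays like those in example_data; HBOMS pads the data internally before passing it to Stan. A rough sketch of the idea (not HBOMS's actual implementation):

```python
def pad_ragged(rows, fill=0.0):
    """Pad unequal-length rows to a rectangle and record the true lengths."""
    width = max(len(row) for row in rows)
    lengths = [len(row) for row in rows]
    padded = [list(row) + [fill] * (width - len(row)) for row in rows]
    return padded, lengths

times = [
    [1, 2, 3, 4, 5],
    [1, 2.5, 3, 4],
    [1, 3, 4, 5, 6, 7],
]
padded, lengths = pad_ragged(times)
# every row now has length 6; `lengths` tells the model which entries are real
```

The recorded lengths are what allow the model to ignore the padding when evaluating the likelihood.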

For now, let's assume that you have prepared a dataset data according to the template example_data. To check whether the initial parameter values are any good, we can compare the simulated trajectories with the data using

fig = hbm.init_check(data)

[figure: init_check output, one panel per unit]

This shows the data for each unit in a separate panel, together with the simulated trajectories based on the initial parameter guess. If the initial guess is OK, we can fit the Stan model to the data:

hbm.sample(
    data,
    iter_warmup=200,
    iter_sampling=200,
    step_size=0.01,
    refresh=1,
    chains=1,
    threads_per_chain=12
)

The keyword arguments iter_warmup etc. are all passed to the CmdStanPy sample method (see the documentation of CmdStanModel.sample). Internally, the data dictionary is modified to make it compatible with the data structure that the Stan model expects. For instance, Stan does not allow ragged data as in example_data, so padding is applied. Also, any constant parameters are added to the data.

The step_size keyword argument is quite important. If you have chosen good initial parameter values, you don't want to make step_size too large: a large initial step size for the HMC algorithm makes it likely that the sampler quickly walks away from your carefully chosen parameters before the HMC hyperparameters have been tuned.

The threads_per_chain argument determines how many CPU threads are used in parallel to run the model. If you have the luxury of access to many CPU cores, you can set this equal to the number of units, so that each initial value problem is solved on a different core.

After sampling is complete, you can check the quality of the fit using

fig = hbm.post_pred_check(data)

[figure: posterior predictive check]

We can now extract the fitted Stan model and get access to the posterior parameter distributions as follows:

a_hat = hbm.fit.stan_variable("a") # hbm.fit is a CmdStanMCMC object
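The returned draws can then be summarized with ordinary Python tools, for instance a posterior mean and an equal-tailed 95% credible interval. The snippet below uses a small stand-in list of draws purely to show the computation:

```python
import statistics

# stand-in for posterior draws, e.g. one column of hbm.fit.stan_variable("a")
draws = [0.78, 0.81, 0.79, 0.83, 0.80, 0.77, 0.82, 0.85, 0.76, 0.80]

post_mean = statistics.mean(draws)

# the 2.5% and 97.5% quantiles bound an equal-tailed 95% credible interval
q = statistics.quantiles(draws, n=40, method="inclusive")
lower, upper = q[0], q[38]
```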

Tutorials

In the folder docs/notebooks there are a number of tutorials, demonstrating the features of HBOMS. Documentation for HBOMS is hosted on hboms.readthedocs.io.
