
Fast Simulation of Hyperplane-Truncated Multivariate Normal Distributions


htnorm

This repo provides a C implementation of a fast and exact sampler from a multivariate normal distribution (MVN) truncated on a hyperplane, as described by Cong, Chen and Zhou (2017); see References.

This repo implements the following from the paper:

  • Efficient sampling from an MVN truncated on a hyperplane:

    [equation image: hptrunc]

  • Efficient sampling from an MVN with a structured precision matrix:

    [equation image: struc]

  • Efficient sampling from an MVN with a structured precision and mean:

    [equation image: strucmean]
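
As a rough illustration of the first sampler (not the library's C implementation), the hyperplane-truncated draw from Cong et al. can be sketched in two steps: draw y from the unconstrained MVN, then shift it onto the hyperplane {x : G x = r} via x = y + Σ Gᵀ (G Σ Gᵀ)⁻¹ (r − G y). The pure-Python sketch below assumes a diagonal covariance and a single constraint row; the function name sample_on_hyperplane and its arguments are hypothetical and are not part of htnorm.

```python
import random


def sample_on_hyperplane(mean, var, g, r, rng):
    """Draw one sample from N(mean, diag(var)) restricted to {x : g . x = r}.

    Illustrative only: assumes a diagonal covariance (given as the list
    `var`) and a single constraint row `g` with scalar right-hand side `r`.
    """
    # Step 1: unconstrained draw y ~ N(mean, diag(var)).
    y = [rng.gauss(m, v ** 0.5) for m, v in zip(mean, var)]
    # Step 2: shift y onto the hyperplane,
    #   x = y + Sigma g^T (g Sigma g^T)^{-1} (r - g . y)
    sigma_g = [v * gi for v, gi in zip(var, g)]
    denom = sum(gi * sg for gi, sg in zip(g, sigma_g))
    scale = (r - sum(gi * yi for gi, yi in zip(g, y))) / denom
    return [yi + sg * scale for yi, sg in zip(y, sigma_g)]


rng = random.Random(12345)
k = 5
mean = [0.5] * k
var = [1.0, 2.0, 0.5, 1.5, 3.0]
g = [1.0] * k  # a row of ones encodes the sum-to-zero constraint
x = sample_on_hyperplane(mean, var, g, 0.0, rng)
print(abs(sum(x)) < 1e-9)  # the constraint holds up to rounding error
```

For a dense covariance or several constraint rows, the same formula applies but needs a linear solve with G Σ Gᵀ, which is the part a BLAS/LAPACK-backed implementation handles.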

The algorithms implemented have the following practical applications:

  • Topic models where unknown parameters can be interpreted as fractions.
  • Admixture models.
  • Discrete graphical models.
  • Sampling from the posterior distribution of an Intrinsic Conditional Autoregressive (ICAR) prior.
  • Sampling from posterior conditional distributions of various Bayesian regression problems.

Dependencies

  • a C compiler that supports the C99 standard or later
  • an installation of BLAS and LAPACK that exposes its C interface via the headers <cblas.h> and <lapacke.h> (e.g. OpenBLAS).

Usage

Building a shared library of htnorm can be done with the following:

$ cd src/
# optionally set the path to the CBLAS and LAPACKE headers using the INCLUDE_DIRS environment variable
$ export INCLUDE_DIRS="some/path/to/headers"
# optionally set the path to the BLAS installation's shared library
$ export LIBS_DIR="some/path/to/library/"
# optionally set the linker flag for your BLAS installation (e.g. -lopenblas)
$ export LIBS="<flag here>"
$ make lib

Afterwards, the shared library can be found in the lib/ directory of the project root and linked dynamically via -lhtnorm.

The public API exposes the samplers through the following function declarations:

  • int htn_hyperplane_truncated_mvn(rng_t* rng, const ht_config_t* conf, double* out)
  • int htn_structured_precision_mvn(rng_t* rng, const sp_config_t* conf, double* out)

The parameters are documented in the header file "htnorm.h".

Random number generation uses the PCG64 or Xoroshiro128plus bit generators. The API also allows a custom generator; the details are documented in the header file "rng.h".

Examples

#include "htnorm.h"

int main ()
{
    ...

    // instantiate a random number generator
    rng_t* rng = rng_new_pcg64();
    ht_config_t config;
    config.g = ...; // G matrix
    config.gnrow = ...; // number of rows of G
    config.gncol = ...; // number of columns of G
    config.r = ...; // r array
    config.mean = ...; // mean array
    config.cov = ...; // the covariance matrix
    config.diag = ...; // whether covariance is diagonal

    double* samples = ...; // array to store the samples
    // now call the sampler
    int res_info = htn_hyperplane_truncated_mvn(rng, &config, samples);

    // res_info contains a number that indicates whether sampling failed or not.

    ...

    // finally free the RNG pointer at some point
    rng_free(rng);

    ...
    return 0;
}

Python API

A high-level Python interface to the library is also provided. Linux users can install it from prebuilt wheels via pip, without needing the C libraries to be available:

pip install pyhtnorm

Wheels are not provided for macOS. To install via pip on macOS, run the following commands:

# set the path to the BLAS installation headers
export INCLUDE_DIR=<path/to/headers>
# set the path to the BLAS shared library
export LIBS_DIR=<some directory>
# set the name of the BLAS shared library (e.g. "openblas")
export LIBS=<lib name>
# finally install via pip so the compilation and linking can be done correctly
pip install pyhtnorm

Alternatively, one can install it from source. This requires an installation of poetry and the following shell commands:

$ git clone https://github.com/zoj613/htnorm.git
$ cd htnorm/
$ poetry install
# add htnorm to python's path
$ export PYTHONPATH=$PWD:$PYTHONPATH

Below is an example of how to use htnorm in Python to sample from a multivariate Gaussian truncated on the hyperplane where the sampled values sum to zero:

from pyhtnorm import HTNGenerator
import numpy as np

rng = HTNGenerator()

# generate example input
k1 = 1000
k2 = 1
npy_rng = np.random.default_rng()
temp = npy_rng.random((k1, k1))
cov = temp @ temp.T + np.diag(npy_rng.random(k1))
G = np.ones((k2, k1))
r = np.zeros(k2)
mean = npy_rng.random(k1)

samples = rng.hyperplane_truncated_mvnorm(mean, cov, G, r)
# verify if sampled values sum to zero
print(sum(samples))

# alternatively one can pass an array to store the results in
out = np.empty(k1)
rng.hyperplane_truncated_mvnorm(mean, cov, G, r, out=out)
# verify
print(out.sum())
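
Because the printed sums are floating-point values, they will typically be very close to zero rather than exactly zero. A minimal sketch of a tolerance-based check of the constraint G x = r is shown below; the helper name satisfies_constraints and the default tolerance are assumptions for illustration, not part of pyhtnorm, and a suitable tolerance depends on the scale of the covariance.

```python
import math


def satisfies_constraints(G, r, x, tol=1e-8):
    """Return True if every row of G x matches r within an absolute tolerance."""
    for row, target in zip(G, r):
        # fsum reduces accumulated rounding error in long sums
        total = math.fsum(g * xi for g, xi in zip(row, x))
        if abs(total - target) > tol:
            return False
    return True


G = [[1.0, 1.0, 1.0]]  # one constraint row: the entries must sum to r[0]
r = [0.0]
print(satisfies_constraints(G, r, [0.25, -1.0, 0.75]))  # sums to zero
print(satisfies_constraints(G, r, [0.25, -1.0, 0.80]))  # off by 0.05
```

The helper only iterates over rows and entries, so it should also accept the NumPy arrays G, r and samples from the example above.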

For more details about the parameters of the HTNGenerator and its methods, see the docstrings via Python's help function.

The Python API also exposes the HTNGenerator class as a Cython extension type that can be "cimported" in a Cython script.

TODO

  • Add an R interface to the library.

Licensing

htnorm is free software made available under the BSD-3 License. For details see the LICENSE file.

References

  • Cong, Y., Chen, B., and Zhou, M. (2017). “Fast Simulation of Hyperplane-Truncated Multivariate Normal Distributions.” Bayesian Analysis, 12(4):1017-1037. doi:10.1214/17-BA1052.
  • Bhattacharya, A., Chakraborty, A., and Mallick, B. K. (2016). “Fast sampling with Gaussian scale mixture priors in high-dimensional regression.” Biometrika, 103(4):985.
