Optimally compress sampling algorithm outputs
Stein Thinning
This Python package implements an algorithm for optimally compressing sampling algorithm outputs by minimising a kernel Stein discrepancy. Please see the accompanying paper "Optimal Thinning of MCMC Output" (arXiv) for details of the algorithm.
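To give a sense of what minimising a kernel Stein discrepancy involves, the following is a minimal NumPy sketch of greedy selection under simplifying assumptions: an inverse multiquadric base kernel, an identity preconditioner, and no standardisation. The names stein_kernel_imq and greedy_stein_thin are illustrative and are not part of this package; the paper and the package source remain the reference for the actual algorithm.

```python
import numpy as np

def stein_kernel_imq(x, y, sx, sy):
    """Langevin Stein kernel built from the base kernel (1 + ||x - y||^2)^(-1/2).

    x, y are points and sx, sy the gradients of the log-density at those
    points. Inputs broadcast row-wise, so x may be (n, d) and y (d,).
    """
    d = x.shape[-1]
    diff = x - y
    r2 = np.sum(diff**2, axis=-1)
    q = 1.0 + r2
    t1 = -3.0 * r2 / q**2.5
    t2 = (d + np.sum((sx - sy) * diff, axis=-1)) / q**1.5
    t3 = np.sum(sx * sy, axis=-1) / q**0.5
    return t1 + t2 + t3

def greedy_stein_thin(smpl, grad, m):
    """Greedily pick m row indices of smpl (illustrative sketch only)."""
    # Running objective for each candidate j:
    # k_p(x_j, x_j) + 2 * sum of k_p(x_i, x_j) over already selected i.
    obj = stein_kernel_imq(smpl, smpl, grad, grad)
    idx = np.empty(m, dtype=int)
    for t in range(m):
        j = int(np.argmin(obj))
        idx[t] = j
        obj = obj + 2.0 * stein_kernel_imq(smpl, smpl[j], grad, grad[j])
    return idx

# Toy target: standard Gaussian, whose log-density gradient at x is -x.
rng = np.random.default_rng(0)
smpl = rng.standard_normal((500, 2))
grad = -smpl
idx = greedy_stein_thin(smpl, grad, 20)
```

In this sketch nothing prevents the same index from being selected more than once. For the standard Gaussian toy target the first selected point is the one closest to the origin, because the kernel evaluated at a single point reduces to d + ||x||^2.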
Installing the package
The latest stable version can be installed via pip:
pip install stein-thinning
To install the current development version, use this command:
pip install git+https://github.com/wilson-ye-chen/stein_thinning
Getting Started
Suppose correlated samples from a posterior distribution have been obtained using an MCMC algorithm and stored in the NumPy array smpl, and the corresponding gradients of the log-posterior are stored in another NumPy array grad. One can then perform Stein Thinning to obtain a subset of 40 sample points by running the following code:
from stein_thinning.thinning import thin
idx = thin(smpl, grad, 40)
The thin function returns a NumPy array containing the row indices in smpl (and grad) of the selected points. Please refer to demo.py as a starting example.
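For experimentation without an MCMC run, the two input arrays can be built by hand for a target whose log-density gradient is known in closed form. The sketch below is an assumption-laden stand-in: it draws independent (not correlated) points from a bivariate Gaussian and evaluates the gradient analytically; the variable names mean, cov and rng are illustrative.

```python
import numpy as np

# Toy target: bivariate Gaussian with known mean and covariance.
mean = np.zeros(2)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])

# Independent draws stand in for MCMC output here; one row per point.
rng = np.random.default_rng(12345)
smpl = rng.multivariate_normal(mean, cov, size=1000)

# Gradient of the log-density at each point: -inv(cov) @ (x - mean).
# Solving the linear system avoids forming the inverse explicitly.
grad = -np.linalg.solve(cov, (smpl - mean).T).T
```

Both arrays have one row per sample point and one column per dimension, which is the layout used by the thin(smpl, grad, 40) call above.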
The default usage requires no additional user input and is based on the identity ('id') preconditioning matrix and a standardised sample. Alternatively, the user can specify which heuristic to use for computing the preconditioning matrix by setting the pre option to 'id', 'med', 'sclmed', or 'smpcov'. Standardisation can be disabled by setting stnd=False. For example, the default setting corresponds to:
idx = thin(smpl, grad, 40, stnd=True, pre='id')
The details for each of the heuristics are documented in Section 2.3 of the accompanying paper.
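As a rough illustration of the quantities these heuristics are built on, the sketch below computes a median pairwise distance and a sample covariance with NumPy. It assumes, as a reading of the paper rather than of the package source, that 'med' scales the identity by the squared median distance and 'smpcov' uses the sample covariance; the helper median_pairwise_distance is hypothetical, and the package may compute these differently (for instance on a subsample).

```python
import numpy as np

def median_pairwise_distance(smpl):
    """Median Euclidean distance over all distinct pairs of rows."""
    diff = smpl[:, None, :] - smpl[None, :, :]
    dist = np.sqrt(np.sum(diff**2, axis=-1))
    # Keep each unordered pair once by taking the strict upper triangle.
    upper = np.triu_indices(smpl.shape[0], k=1)
    return np.median(dist[upper])

rng = np.random.default_rng(0)
smpl = rng.standard_normal((200, 2))
d = smpl.shape[1]

# Assumed form of a median-based preconditioner: scaled identity.
ell = median_pairwise_distance(smpl)
gamma_med = ell**2 * np.eye(d)

# Assumed form of a covariance-based preconditioner: sample covariance.
gamma_smpcov = np.cov(smpl, rowvar=False)
```

The pairwise computation above uses memory quadratic in the number of points, so it is only practical for small samples.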
PyStan Example
As an illustration of how Stein Thinning can be used to post-process output from Stan, consider the following simple PyStan script that produces correlated samples from a bivariate Gaussian model:
import stan
mc = """
parameters {vector[2] x;}
model {x ~ multi_normal([0, 0], [[1, 0.8], [0.8, 1]]);}
"""
sm = stan.build(mc, random_seed=12345)
fit = sm.sample(num_samples=1000)
The bivariate Gaussian model is used for illustration, but regardless of the complexity of the model being sampled, the output of Stan will always be a fit object (StanFit instance). The sampled points and the log-posterior gradients can be extracted from the returned fit object:
import numpy as np
sample = fit['x'].T
gradient = np.apply_along_axis(lambda x: sm.grad_log_prob(x.tolist()), 1, sample)
idx = thin(sample, gradient, 40)
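The row-wise gradient extraction above can be tried without Stan by substituting a plain Python function. In the sketch below, grad_log_prob is a hypothetical stand-in for sm.grad_log_prob that returns the analytic gradient for the same zero-mean bivariate Gaussian; everything else mirrors the np.apply_along_axis pattern used above.

```python
import numpy as np

# Inverse covariance of the bivariate Gaussian from the Stan model.
COV_INV = np.linalg.inv(np.array([[1.0, 0.8], [0.8, 1.0]]))

def grad_log_prob(x):
    """Hypothetical stand-in for sm.grad_log_prob (takes a list of floats)."""
    return -COV_INV @ np.asarray(x)

# A few arbitrary points, one per row, in place of fit['x'].T.
rng = np.random.default_rng(0)
sample = rng.standard_normal((5, 2))

# Apply the gradient function to each row (axis 1 runs along a row).
gradient = np.apply_along_axis(lambda x: grad_log_prob(x.tolist()), 1, sample)
```

The resulting gradient array has the same shape as sample, which is worth checking before thinning. Note also that PyStan's grad_log_prob takes unconstrained parameter values; these coincide with the sampled values here only because x has no constraints.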
The selected points can then be plotted:
import matplotlib.pyplot as plt
plt.figure()
plt.scatter(sample[:, 0], sample[:, 1], color='lightgray')
plt.scatter(sample[idx, 0], sample[idx, 1], color='red')
plt.show()
The above example can be found in stein_thinning/demo/pystan.py.