Skip to main content

Efficiently generate samples from the Polya-Gamma distribution using a NumPy/SciPy compatible interface.

Project description

polya-gamma

PyPI - Wheel PyPI PyPI - License CircleCI Codecov PyPI - Downloads

Efficiently generate samples from the Polya-Gamma distribution using a NumPy/SciPy compatible interface. densities

Features

  • polyagamma is written in C and optimized for performance.
  • Very light and easy to install (pre-built wheels).
  • It is flexible and allows the user to sample using one of 4 available methods.
  • Input parameters can be scalars, arrays or both; allowing for easy generation of multi-dimensional samples without specifying the size.
  • Random number generation is thread safe.
  • The functional API resembles that of common numpy/scipy functions, therefore making it easy to plugin to existing libraries.

Dependencies

  • Numpy >= 1.17

Installation

To get the latest version of the package, one can install it by downloading the wheel/source distribution from the releases page, or using pip with the following shell command:

$ pip install polyagamma

To install the latest pre-release version, use:

$ pip install --pre -U polyagamma

Alternatively, once can install from source by cloning the repo. This requires an installation of poetry and the following shell commands:

$ git clone https://github.com/zoj613/polya-gamma.git
$ cd polya-gamma/
$ poetry install
# add package to python's path
$ export PYTHONPATH=$PWD:$PYTHONPATH 

Example

Python

import numpy as np
from polyagamma import polyagamma

# generate a PG(1, 0) sample
o = polyagamma()

# Get a 5 by 10 array of PG(1, 2) variates.
o = polyagamma(z=2, size=(5, 10))

# Pass sequences as input. Numpy's broadcasting rules apply here.
h = [[1.5, 2, 0.75, 4, 5],
     [9.5, 8, 7, 6, 0.9]]
o = polyagamma(h, -2.5)

# Pass an output array
out = np.empty(5)
polyagamma(out=out)
print(out)

# one can choose a sampling method from {devroye, alternate, gamma, saddle}.
# If not given, the default behaviour is a hybrid sampler that picks a method
# based on the parameter values.
o = polyagamma(method="saddle")

# one can also use an existing instance of `numpy.random.Generator` as a parameter.
# This is useful to reproduce samples generated via a given seed.
rng = np.random.default_rng(12345)
o = polyagamma(random_state=rng)

# If one is using a `numpy.random.RandomState` instance instead of the `Generator`
# class, the object's underlying bitgenerator can be passed as the value of random_state
bit_gen = np.random.RandomState(12345)._bit_generator
o = polyagamma(random_state=bit_gen)

# When passing a large input array for the shape parameter `h`, parameter value
# validation checks can be disabled to avoid some overhead, which may boost performance.
large_h = np.ones(1000000)
o = polyagamma(large_h, disable_checks=True)

C

For an example of how to use polyagamma in a C program, see here.

Benchmarks

Below are runtime plots of 20000 samples generated for various values of h and z, using each method. We restrict h to integer values to accomodate the devroye method, which cannot be used for non-integer h. The version of the package used to generate them is v1.1.0.

Generally:

  • The gamma method is slowest and should be avoided in cases where speed is paramount.
  • For h > 15, the saddle method is the fastest for any value of z.
  • For z <= 1 and integer h <= 15, the devroye method is the most efficient.
  • For z > 1 and integer/non-integer h <= 15, the alternate method is the most efficient.
  • For h > 50 (or any value large enough), the normal approximation to the distribution is fastest (not reported in the above plot but it is around 10 times faster than the saddle method and also equally accurate).

Therefore, we devise a "hybrid/default" sampler that picks a sampler based on the above guidelines.

We also benchmark the hybrid sampler runtime with the sampler found in the pypolyagamma package (version 1.2.3). The version of NumPy we use is 1.20.1. We use the pgdrawv function which takes arrays as input. Below are runtime plots of 20000 samples for each value of h and z. Values of h range from 0.1 to 60, while z is set to 0, 2.5, 5, and 10.

It can be seen that when generating many samples at once for any given combination of parameters, polyagamma outperforms the pypolyagamma package. The exception is when h < 1. We rely on the saddle method to generate samples when the shape parameter is very small, which is not very efficient for such values. For values of h larger than 50, we use the normal approximation, which explains the large dip in runtime past h=50. It is also worth noting that the pypolygamma package is on average faster than ours at generating exactly one sample from the distribution. This is mainly due to the overhead introduced by creating the bitgenerator + acquiring/releasing the thread lock + doing parameter validation checks at every call to the function. This overhead can somewhat be mitigated by passing in a random generator instance at every call to the polyagamma function. For example, on an iPython session:

In [4]: %timeit polyagamma()
83.2 µs ± 1.02 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [5]: %timeit polyagamma(random_state=rng)
2.68 µs ± 29.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [6]: %timeit polyagamma(random_state=rng, disable_checks=True)
2.58 µs ± 22.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

To generate the above plots locally, run python scripts/benchmark.py --size=<some size> --z=<z value>. Note that the runtimes may differ than the ones reported here, depending on the machine this script is ran on.

Contributing

All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.

To submit a PR, follow the steps below:

  1. Fork the repo.
  2. Setup the dev environment with poetry install. All dependencies will be installed.
  3. Start writing your changes, including unittests.
  4. Once finished, run make install to build the project with the new changes.
  5. Once build is successful, run tests to make sure they all pass with make test.
  6. Once finished, you can submit a PR for review.

References

  • Luc Devroye. "On exact simulation algorithms for some distributions related to Jacobi theta functions." Statistics & Probability Letters, Volume 79, Issue 21, (2009): 2251-2259.
  • Polson, Nicholas G., James G. Scott, and Jesse Windle. "Bayesian inference for logistic models using Pólya–Gamma latent variables." Journal of the American statistical Association 108.504 (2013): 1339-1349.
  • J. Windle, N. G. Polson, and J. G. Scott. "Improved Polya-gamma sampling". Technical Report, University of Texas at Austin, 2013b.
  • Windle, Jesse, Nicholas G. Polson, and James G. Scott. "Sampling Polya-Gamma random variates: alternate and approximate techniques." arXiv preprint arXiv:1405.0506 (2014)
  • Windle, J. (2013). Forecasting high-dimensional, time-varying variance-covariance matrices with high-frequency data and sampling Pólya-Gamma random variates for posterior distributions derived from logistic likelihoods.(PhD thesis). Retrieved from http://hdl.handle.net/2152/21842 .

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

polyagamma-1.1.0.tar.gz (147.7 kB view details)

Uploaded Source

Built Distributions

polyagamma-1.1.0-cp39-cp39-manylinux2014_x86_64.whl (587.9 kB view details)

Uploaded CPython 3.9

polyagamma-1.1.0-cp39-cp39-manylinux2010_x86_64.whl (556.8 kB view details)

Uploaded CPython 3.9 manylinux: glibc 2.12+ x86-64

polyagamma-1.1.0-cp39-cp39-manylinux1_x86_64.manylinux2010_x86_64.whl (556.8 kB view details)

Uploaded CPython 3.9 manylinux: glibc 2.12+ x86-64

polyagamma-1.1.0-cp38-cp38-manylinux2014_x86_64.whl (576.0 kB view details)

Uploaded CPython 3.8

polyagamma-1.1.0-cp37-cp37m-manylinux2014_x86_64.whl (560.0 kB view details)

Uploaded CPython 3.7m

polyagamma-1.1.0-cp37-cp37m-manylinux2010_x86_64.whl (529.4 kB view details)

Uploaded CPython 3.7m manylinux: glibc 2.12+ x86-64

polyagamma-1.1.0-cp37-cp37m-manylinux2010_x86_64.manylinux1_x86_64.whl (529.4 kB view details)

Uploaded CPython 3.7m manylinux: glibc 2.12+ x86-64

File details

Details for the file polyagamma-1.1.0.tar.gz.

File metadata

  • Download URL: polyagamma-1.1.0.tar.gz
  • Upload date:
  • Size: 147.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.8.6

File hashes

Hashes for polyagamma-1.1.0.tar.gz
Algorithm Hash digest
SHA256 793ef1329763d64a603d7aa1de01377d2cbb7294501fffb7b036844ffae2f62b
MD5 91e88be5f59c37ec996f140adc944e64
BLAKE2b-256 1a2d10b5ed13a0a781f5eed5f7452390df8f17dddc2954d7f3b0eef8254b027e

See more details on using hashes here.

File details

Details for the file polyagamma-1.1.0-cp39-cp39-manylinux2014_x86_64.whl.

File metadata

  • Download URL: polyagamma-1.1.0-cp39-cp39-manylinux2014_x86_64.whl
  • Upload date:
  • Size: 587.9 kB
  • Tags: CPython 3.9
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.8.6

File hashes

Hashes for polyagamma-1.1.0-cp39-cp39-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 39bef418f1bbb106c5d1a6038142cd327c191d2db114a7d6531655022220ee47
MD5 5e53b7e69b76fac9c70cd52511e14729
BLAKE2b-256 7290f9266519d110f6209491195f43b3a1f474b002d7006ac5af95fcbe04b2a9

See more details on using hashes here.

File details

Details for the file polyagamma-1.1.0-cp39-cp39-manylinux2010_x86_64.whl.

File metadata

  • Download URL: polyagamma-1.1.0-cp39-cp39-manylinux2010_x86_64.whl
  • Upload date:
  • Size: 556.8 kB
  • Tags: CPython 3.9, manylinux: glibc 2.12+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.8.6

File hashes

Hashes for polyagamma-1.1.0-cp39-cp39-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 1a44e9cf5e277d2e0a150b874a88b0e2702add99684f95402fc1984832ffa641
MD5 b96c3162efe9b2a6cf88c21c103d86a7
BLAKE2b-256 3dcfc2ff58e3c1540401796fc1cff907b80cf4a092a2e913648b4f0324c934c7

See more details on using hashes here.

File details

Details for the file polyagamma-1.1.0-cp39-cp39-manylinux1_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for polyagamma-1.1.0-cp39-cp39-manylinux1_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 99be886ae76b84aa064b23fa43d17d4edc1714bf38fc8b34657b50d112c3b2bb
MD5 f27a1cc4318e230b69b4bc9148e53fd2
BLAKE2b-256 c12dd2f729641531a4f52259e5412bf2440e000a0ff3abee219cf030e33288bd

See more details on using hashes here.

File details

Details for the file polyagamma-1.1.0-cp38-cp38-manylinux2014_x86_64.whl.

File metadata

  • Download URL: polyagamma-1.1.0-cp38-cp38-manylinux2014_x86_64.whl
  • Upload date:
  • Size: 576.0 kB
  • Tags: CPython 3.8
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.8.6

File hashes

Hashes for polyagamma-1.1.0-cp38-cp38-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 bb2e19c11791e2a8a0072e9870e414d0ce7cf84f662d9a6a95f97e4eb8f7ba19
MD5 22ab17b890b2dadb43ca32422838fbc6
BLAKE2b-256 8a4c6f034c95b33a8b72d4ab11fd1ba9179ee780fd29b79f2680ac2a2d87b95f

See more details on using hashes here.

File details

Details for the file polyagamma-1.1.0-cp37-cp37m-manylinux2014_x86_64.whl.

File metadata

  • Download URL: polyagamma-1.1.0-cp37-cp37m-manylinux2014_x86_64.whl
  • Upload date:
  • Size: 560.0 kB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.8.6

File hashes

Hashes for polyagamma-1.1.0-cp37-cp37m-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 32234c9f19bbcdd14f0a15262582ed1d424c6d7d4a96d9af84456f1b0e4793f4
MD5 6148ea20a8ebc9cae22000194bbae0bd
BLAKE2b-256 eecea6f04e21e8f70bc5fc657b967fbbeea59b2770c64c1f51fb95c263b36545

See more details on using hashes here.

File details

Details for the file polyagamma-1.1.0-cp37-cp37m-manylinux2010_x86_64.whl.

File metadata

  • Download URL: polyagamma-1.1.0-cp37-cp37m-manylinux2010_x86_64.whl
  • Upload date:
  • Size: 529.4 kB
  • Tags: CPython 3.7m, manylinux: glibc 2.12+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/49.2.1 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.8.6

File hashes

Hashes for polyagamma-1.1.0-cp37-cp37m-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 16fa44a194946dda5d84f8269980293dcc8dc87abe15a626c9adf4d6089c9b33
MD5 beb99e18413b14979b0cb61fb6626e4b
BLAKE2b-256 40736ba36a33131aa9a95310b3ed1f12a4c55411a54ce35cda9fb157e1e3b486

See more details on using hashes here.

File details

Details for the file polyagamma-1.1.0-cp37-cp37m-manylinux2010_x86_64.manylinux1_x86_64.whl.

File metadata

File hashes

Hashes for polyagamma-1.1.0-cp37-cp37m-manylinux2010_x86_64.manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 31e4a26e17cb5dba0518c0e418b45ec0512ca7c94234f83c0d2ace5d67d6ee72
MD5 afbe4c2781046e980193e0d55863a006
BLAKE2b-256 4693ce11f2a1db59f43c2c17d909e00425dcd7b819ffde35615512c8c633357e

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page