
A library for sampling correlated binary variates.


bindata


A Python replication of the R library of the same name, bindata, based on the paper "Generation of correlated artificial binary data" by Friedrich Leisch, Andreas Weingessel, and Kurt Hornik.

The library replicates the existing R package in full and provides the following functions:

  • bincorr2commonprob
  • check_commonprob (check.commonprob in R)
  • commonprob2sigma
  • condprob
  • ra2ba
  • rmvbin
  • simul_commonprob (simul.commonprob in R)

Precomputed SimulVals (obtained via Monte Carlo simulation) are also available.

Installation

bindata can be installed with pip as:

pip install bindata

How to

Generate uncorrelated variates

import bindata as bnd

margprob = [0.3, 0.9]

X = bnd.rmvbin(N=100_000, margprob=margprob)

Now let's verify the sample marginals and correlations:

import numpy as np

print(X.mean(0))
print(np.corrcoef(X, rowvar=False))
[0.30102 0.9009 ]
[[ 1.         -0.00101357]
 [-0.00101357  1.        ]]
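To judge whether these sample values are consistent with the targets, the deviations can be compared with their approximate sampling standard errors. The sketch below uses only NumPy and standard large-sample approximations; the variable names se_mean and se_corr are illustrative and not part of bindata.

```python
import numpy as np

N = 100_000
margprob = np.array([0.3, 0.9])

# Standard error of a sample proportion: sqrt(p * (1 - p) / N)
se_mean = np.sqrt(margprob * (1 - margprob) / N)

# Approximate standard error of a sample correlation when the true
# correlation is zero (large-sample approximation): 1 / sqrt(N)
se_corr = 1 / np.sqrt(N)

print(se_mean)
print(se_corr)
```

The deviations in the output above (about 0.001 for each marginal and for the off-diagonal correlation) are of the same order as these standard errors, which is what one would expect from a correct sampler.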

Generate correlated variates

From a correlation matrix

corr = np.array([[1., -0.25, -0.0625],
                 [-0.25,   1.,  0.25],
                 [-0.0625, 0.25, 1.]])
commonprob = bnd.bincorr2commonprob(margprob=[0.2, 0.5, 0.8],
                                    bincorr=corr)

X = bnd.rmvbin(margprob=np.diag(commonprob),
               commonprob=commonprob, N=100_000)
print(X.mean(0))
print(np.corrcoef(X, rowvar=False))
[0.1996  0.50148 0.80076]
[[ 1.         -0.25552    -0.05713501]
 [-0.25552     1.          0.24412401]
 [-0.05713501  0.24412401  1.        ]]
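For intuition about what a correlation-to-joint-probability conversion involves, here is a minimal NumPy sketch of the standard identity for binary variables, P(X_i = 1, X_j = 1) = r_ij * sqrt(p_i (1 - p_i) p_j (1 - p_j)) + p_i p_j. The helper corr_to_commonprob is hypothetical and written for illustration only; it is not the library's code, and agreement with bnd.bincorr2commonprob is assumed rather than verified here.

```python
import numpy as np

def corr_to_commonprob(margprob, bincorr):
    """Hypothetical helper: pairwise joint probabilities of binary
    variables from their marginals and correlation matrix."""
    p = np.asarray(margprob, dtype=float)
    r = np.asarray(bincorr, dtype=float)
    sd = np.sqrt(p * (1 - p))  # standard deviation of each Bernoulli variable
    # Covariance (r * sd_i * sd_j) plus the product of the marginals
    return r * np.outer(sd, sd) + np.outer(p, p)

corr = np.array([[1., -0.25, -0.0625],
                 [-0.25, 1., 0.25],
                 [-0.0625, 0.25, 1.]])
cp = corr_to_commonprob([0.2, 0.5, 0.8], corr)
print(cp)
```

The diagonal of the result equals the marginal probabilities, which is why np.diag(commonprob) can be passed as margprob in the example above.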

From a joint probability matrix

commonprob = [[1/2, 1/5, 1/6],
              [1/5, 1/2, 1/6],
              [1/6, 1/6, 1/2]]
X = bnd.rmvbin(N=100_000, commonprob=commonprob)

print(X.mean(0))
print(np.corrcoef(X, rowvar=False))
[0.50076 0.50289 0.49718]
[[ 1.         -0.20195239 -0.33343712]
 [-0.20195239  1.         -0.34203855]
 [-0.33343712 -0.34203855  1.        ]]
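The correlations implied by a joint probability matrix can be computed directly and compared with the sample values. This is a minimal NumPy sketch of the usual identity corr_ij = (p_ij - p_i p_j) / sqrt(p_i (1 - p_i) p_j (1 - p_j)); the name expected_corr is illustrative and not part of bindata.

```python
import numpy as np

commonprob = np.array([[1/2, 1/5, 1/6],
                       [1/5, 1/2, 1/6],
                       [1/6, 1/6, 1/2]])

p = np.diag(commonprob)    # marginal probabilities sit on the diagonal
sd = np.sqrt(p * (1 - p))  # standard deviation of each Bernoulli variable

# Covariance divided by the product of the standard deviations
expected_corr = (commonprob - np.outer(p, p)) / np.outer(sd, sd)
print(expected_corr)
```

The implied off-diagonal correlations are -0.2 and -1/3, which the sample correlations printed above roughly match.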

For more comprehensive information, please consult the documentation.

Acknowledgements

Author

Luca Mingarelli, 2022
