Skip to main content

A Python package for Monte Carlo sampling.

Project description

probabilit

A small Python package for Monte Carlo modeling.

  • User friendly API with a modeling language.
  • Built on scipy and numpy.
  • Supports composite distributions (e.g. the mean of a distribution can be a distribution).
  • Supports Quasi-Monte Carlo sampling, e.g. Sobol, Halton and LHS.
  • Supports inducing correlations with Iman-Conover.

Modeling

The modeling API lets you create a computational graph, where each node is either a distribution, a constant or a transformation. Once the method .sample() is called on a node, each ancestor node is sampled in turn.

Example 1 - Height. What is the probability that a man is taller than a woman?

>>> from probabilit.modeling import Distribution
>>> male_height = Distribution("norm", loc=176, scale=7.1)
>>> female_height = Distribution("norm", loc=162.5, scale=7.1)
>>> statistic = (male_height > female_height)
>>> samples = statistic.sample(999, random_state=0)
>>> float(samples.mean())
0.9039...

When statistic is sampled in the code above, the ancestor nodes male_height and female_height are sampled too. In each node, results are stored in the samples_ attribute.

>>> import pandas as pd
>>> pd.Series(male_height.samples_).describe()
count    999.000000
mean     176.096916
std        7.326152
min      153.210671
25%      171.209087
50%      176.044660
75%      180.948085
max      201.216617
dtype: float64

Example 2 - Bird survival. This example illustrates composite distributions, where the argument to one distribution is another distribution. Suppose we have a distribution governing the number off eggs per bird nest for a certain species, and each hatched bird a survival probability. What is the distribution of the number of birds that survive per nest?

>>> eggs_per_nest = Distribution("poisson", mu=3)
>>> survivial_prob = 0.4
>>> survived = Distribution("binom", n=eggs_per_nest, p=survivial_prob)
>>> survived.sample(9, random_state=0) # Sample a few values only
array([2., 1., 1., 2., 2., 2., 2., 0., 0.])

Example 3 - Mutual fund. Suppose we save 1200 units of money per year and that the yearly interest rate has a distribution N(1.11, 0.15). How much money will we have after 20 years?

>>> saved_per_year = 1200
>>> returns = 0
>>> for year in range(20):
...     interest = Distribution("norm", loc=1.11, scale=0.15)
...     returns = returns * interest + saved_per_year
>>> samples = returns.sample(999, random_state=42)
>>> samples.mean(), samples.std()
(np.float64(76583.58738496085), np.float64(33483.2245611436))

Low-level API

The low-level API contains numpy functions for working with random variables. The two most important ones are (1) the nearest_correlation_matrix function and (2) the ImanConover class.

Fixing user-supplied correlation matrices. The function nearest_correlation_matrix can be used to fix user-specified correlation matrices, which are often invalid. Below a user has specified some correlations, but the resulting correlation matrix has a negative eigenvalue and is not positive definite.

>>> import numpy as np
>>> from probabilit.correlation import nearest_correlation_matrix
>>> X = np.array([[1, 0.9, 0],
...               [0.9, 1, 0.8],
...               [0, 0.8, 1]])
>>> np.linalg.eigvals(X) # Not a valid correlation matrix
array([-0.20415946,  1.        ,  2.20415946])
>>> nearest_correlation_matrix(X)
array([[1.        , 0.77523696, 0.07905637],
       [0.77523696, 1.        , 0.69097837],
       [0.07905637, 0.69097837, 1.        ]])
>>> np.linalg.eigvals(nearest_correlation_matrix(X))
array([2.07852823e+00, 9.21470108e-01, 1.66710188e-06])

Inducing correlations on samples. The class ImanConover can be used to induce correlations on uncorrelated variables without changing the marginal distributions. There's no guarantee that we're able to achieve the desired correlation structure, but in practice we often get close.

>>> import scipy as sp
>>> from probabilit.correlation import ImanConover
>>> sampler = sp.stats.qmc.LatinHypercube(d=2, seed=42, scramble=True)
>>> samples = sampler.random(n=100)
>>> X = np.vstack((sp.stats.triang(0.5).ppf(samples[:, 0]),
...                sp.stats.gamma.ppf(samples[:, 1], a=1))).T

Now we can induce correlations:

>>> float(sp.stats.pearsonr(*X.T).statistic)
0.065898...
>>> correlation_matrix = np.array([[1, 0.3], [0.3, 1]])
>>> transform = ImanConover().set_target(correlation_matrix)
>>> X_transformed = transform(X)
>>> float(sp.stats.pearsonr(*X_transformed.T).statistic)
0.279652...

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

probabilit-0.0.1.tar.gz (56.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

probabilit-0.0.1-py3-none-any.whl (48.1 kB view details)

Uploaded Python 3

File details

Details for the file probabilit-0.0.1.tar.gz.

File metadata

  • Download URL: probabilit-0.0.1.tar.gz
  • Upload date:
  • Size: 56.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for probabilit-0.0.1.tar.gz
Algorithm Hash digest
SHA256 715812a78586da4d9a36f53d1971120369ce0d23713decd85c2d6b8ad602181d
MD5 2f3f45ebadbd1781b06c3b198e4e1db3
BLAKE2b-256 9dac06d6a9dbae7816c57cf34bd555d23665f90ca1193d9ed8cbec9df052640b

See more details on using hashes here.

File details

Details for the file probabilit-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: probabilit-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 48.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for probabilit-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 279a9d733fc70fd74c791172f8bc722ee1617de01695a84bf887b348288d952c
MD5 987aea8bdccee46ff4609ccc76acc0ba
BLAKE2b-256 8471c61d047089197ef71da2b0dab9fc3d4a6a26f063242d7fe46a61275dce60

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page