Skip to main content

Formally verified biased coin and n-sided die

Project description

Zarpy: formally verified biased coin and n-sided die.

See the paper (to appear in PLDI'23) and Github repository.

Why use Zarpy?

Probabilistic choice

A basic operation in randomized algorithms is probabilistic choice: for some p ∈ [0,1], execute action a1 with probability p or action a2 with probability 1-p (i.e., flip a biased coin to determine the path of execution). A common method for performing probabilistic choice is as follows:

if random() < p:
  execute a1
else:
  execute a2

where p is a float in the range [0,1] and random() produces a random float in the range [0,1). While good enough for many applications, this approach is not always correct due to float roundoff error. We can only expect a1 to be executed with probability p + ϵ for some small error term ϵ, which technically invalidates any correctness guarantees of our overall system that depend on the correctness of its probabilistic choices.

Zarpy provides an alternative that is guaranteed (by formal proof in Coq) to execute a1 with probability p (where n and d are integers such that p = n/d):

from zarpy import build_coin, flip
build_coin((n, d)) # Build and cache coin with bias p = n/d
if flip(): # Generate a Boolean value with Pr(True) = p 
  execute a1
else:
  execute a2

Uniform sampling

Another common operation is to randomly draw from a finite collection of values with equal (uniform) probability of each. An old trick for drawing an integer uniformly from the range [0, n) is to generate a random integer from [0, RAND_MAX] and take the modulus wrt. n:

x = rand() % n # Assign x random value from [0,n)

but this method suffers from modulo bias when n is not a power of 2, causing some values to occur with higher probability than others (see, e.g., this article for more information on modulo bias). Zarpy provides a uniform sampler that is guaranteed for any integer 0 < n to generate samples from [0,n) with probability 1/n each:

from zarpy import build_die, roll
build_die(n)
x = roll()

Although the Python function random.randint is ostensibly free from modulo bias, our implementation guarantees so by a formal proof of correctness in Coq.

Trusted Computing Base

The samplers provided by Zarpy have been implemented and verified in Coq and extracted to OCaml and bundled into Python package via pythonlib. Validity of the correctness proofs is thus dependent on the correctness of Coq's extraction mechanism, the OCaml compiler and runtime, a small amount of OCaml shim code (viewable here), and the pythonlib library.

Proofs of correctness

The coin and die samplers are implemented as probabilistic programs in the Zar system and compiled to interaction trees implementing them via reduction to sequences of fair coin flips. See Section 3 of the paper for details and the file zarpy.v for their implementations and proofs of correctness.

Correctness is two-fold. For biased coin with bias p, we prove:

  • coin_itree_correct: the probability of producing true according to the formal probabilistic semantics of the constructed interaction tree is equal to p, and

  • coin_samples_equidistributed: when the source of random bits is uniformly distributed, for any sequence of coin flips the proportion of true samples converges to p as the number of samples goes to +∞.

The equidistribution result is dependent on uniform distribution of the Boolean values generated by OCaml's Random.bool function. See the paper for a more detailed explanation.

Similarly, the theorem die_itree_correct proves semantic correctness of the n-sided die, and die_samples_equidistributed equidistribution of its samples.

Usage

seed() initializes the PRNG via Random.self_init.

Biased coin

build_coin((num, denom)) builds and caches a coin with Pr(True) = num/denom for nonnegative integer num and positive integer denom.

flip() produces a single Boolean sample by flipping the cached coin.

flip_n(n) produces n Boolean samples by flipping the cached coin.

N-sided die

build_die(n) builds and caches an n-sided die with Pr(m) = 1/n for integer m where 0 <= m < n.

roll() produces a single sample by rolling the cached die.

roll_n(n) produces n integer samples by rolling the cached die.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

zarpy-1.0.1.tar.gz (6.0 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

zarpy-1.0.1-py3-none-any.whl (6.1 MB view details)

Uploaded Python 3

File details

Details for the file zarpy-1.0.1.tar.gz.

File metadata

  • Download URL: zarpy-1.0.1.tar.gz
  • Upload date:
  • Size: 6.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.10

File hashes

Hashes for zarpy-1.0.1.tar.gz
Algorithm Hash digest
SHA256 102b4a3f591eb034cae1847a74928953dcf5843146758fa363d75ec7d29567c6
MD5 7c672b2c45d7ce42a6c81111b0190ad9
BLAKE2b-256 4c560a7fc15f4a73008ee19e47a9bec2e9425776132dc131a9d35ae3e957d9fd

See more details on using hashes here.

File details

Details for the file zarpy-1.0.1-py3-none-any.whl.

File metadata

  • Download URL: zarpy-1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 6.1 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.10

File hashes

Hashes for zarpy-1.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 741d1b23f50622103a40404a40a9bfdc25f18f8138354a0ce70954f20531b047
MD5 8cb54012c7418996532e07d688b09a0a
BLAKE2b-256 6d8bc0aadb8a10c90c71e504c38e0bd36de0a34b279f0f40660f0abefdf440c0

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page