Skip to main content

Cython implementation of Vose's Alias method.

Project description

vose

This is a Cython implemention of Michael Vose's Alias method. It can be used to perform weighted sampling with replacement of integers in O(1) time. It requires a construction phase that runs in O(n) time, with n being the number of integers with associated weights. As far as I know, it's faster than any other method available in Python. But I would love to be proven wrong!

I wrote this because I had a specific usecase where I needed to repeatidly sample integers with a weight associated to each integer. I stumbled on Keith Schwarz's Darts, Dice, and Coins: Sampling from a Discrete Distribution, which is very well written, and decided to use the Alias method. Alas, numpy doesn't seem to have it available, and neither does the random module from Python's standard library. There is, however, the vose_sampler package, but it is written in pure Python and isn't fast enough for my purposes. I therefore decided to write it in Cython and shamelessly adapted Keith Schmarz's Java implementation.

Installation

pip install vose

Usage

You first have to initialize a sampler with an array of weights. These weights are not required to sum up to 1.

>>> import numpy as np
>>> import vose

>>> weights = np.array([.1, .3, .4, .2])
>>> sampler = vose.Sampler(weights, seed=42)

You can then call the .sample() method to sample a random integer in range [0, n - 1], where n is the number of weights that were passed.

>>> sampler.sample()
3

You can set the k parameter in order to produce multiple samples.

>>> sampler.sample(k=10)
array([3, 3, 2, 1, 1, 2, 2, 3, 0, 2])

By default, a copy of the weights is made. You can disable this behavior in order to save a few microseconds, but this will modify the provided array.

>>> sampler = vose.Sampler(weights, seed=42, copy=False)

Note that vose.Sampler expects to be provided with a memoryview. In order to pass a list, you have to convert it yourself to a numpy array.

>>> weights = [.2, .3, .5]
>>> sampler = vose.Sampler(np.array(weights))

You can also use vose.Sampler from within your own Cython .pyx file:

import numpy as np

cimport vose
cimport numpy as np

cdef np.float [:] weights = np.array([.2, .3, .5], dtype=float)
cdef vose.Sampler sampler
sampler = vose.Sampler(weights)

cdef int sample = sampler.sample_1()
cdef np.int [:] samples = sampler.sample_k(10)

Note that the latter requires having to include the numpy headers in the extension definition of your setup.py:

from setuptools import Extension
from setuptools import setup
from Cython.Build import cythonize
import numpy as np

extension = Extension(
    '*', ['your_file.pyx'],
    include_dirs=[np.get_include()],
    define_macros=[('NPY_NO_DEPRECATED_API', 'NPY_1_7_API_VERSION')]
)

setup(ext_modules=cythonize([extension]))

Is it reliable?

It seems to be working correctly; at least according to the following chi-squared tests:

>>> import numpy as np
>>> from scipy import stats

>>> rng = np.random.default_rng(seed=42)
>>> k = 1000

>>> for n in range(3, 20):
...     weights = rng.dirichlet(np.arange(1, n))
...     sampler = vose.Sampler(weights, seed=42)
...     samples = sampler.sample(k)
...     chi2 = stats.chisquare(f_obs=np.bincount(samples), f_exp=weights * k)
...     assert chi2.pvalue > .05

It is also reproducible:

>>> import numpy as np
>>> import vose
>>> probs = np.array([0.5, 0.5])
>>> a = vose.Sampler(probs, seed=0)
>>> b = vose.Sampler(probs, seed=0)
>>> for _ in range(10000):
...     assert a.sample() == b.sample()

Note that the seed method can be used to set the state of the sampler's RNG without having to re-initialize the weights:

>>> import numpy as np
>>> import vose
>>> probs = np.ones(5)
>>> a = vose.Sampler(probs, seed=3)
>>> a.sample(4)
array([2, 2, 2, 3])
>>> a.seed(3)
>>> a.sample(4)
array([2, 2, 2, 3])

Is it fast?

Hell yeah. The following graph shows the average time taken to sample one integer for different amounts of weights:


As you can see, vose.Sampler takes less than a nanosecond to produce a random integer. Here is the construction time:


vose.Sampler is also fast enough to compete with numpy and random, even when including the construction time. The following table summarizes a comparison I made on my laptop, with n being the number of weights and k the number of integers to sample:

n k np.random.choice random.choices vose.Sampler vose_sampler.VoseAlias
3 1 26ns 2ns 4ns 11ns
3 2 26ns 3ns 7ns 13ns
3 3 26ns 3ns 7ns 14ns
3 10 26ns 6ns 7ns 27ns
3 100 28ns 47ns 8ns 198ns
3 1000 46ns 461ns 19ns 1μs, 887ns
30 1 27ns 6ns 4ns 69ns
30 2 26ns 7ns 7ns 73ns
30 3 27ns 7ns 8ns 72ns
30 10 27ns 14ns 7ns 88ns
30 100 31ns 63ns 8ns 256ns
30 1000 67ns 580ns 19ns 1μs, 935ns
300 1 29ns 47ns 6ns 661ns
300 2 29ns 47ns 9ns 659ns
300 3 29ns 49ns 9ns 685ns
300 10 29ns 54ns 9ns 685ns
300 100 36ns 112ns 10ns 877ns
300 1000 96ns 717ns 20ns 2μs, 599ns
3000 1 52ns 416ns 18ns 6μs, 988ns
3000 2 50ns 420ns 21ns 7μs, 39ns
3000 3 51ns 439ns 21ns 7μs, 102ns
3000 10 51ns 420ns 21ns 7μs, 332ns
3000 100 59ns 496ns 23ns 7μs, 349ns
3000 1000 137ns 1μs, 213ns 35ns 10μs, 190ns

In summary, you probably don't need to be using vose.Sampler if you only need to sample once, regardless of the number of integers you wish to sample. You want to use vose.Sampler when you need to sample in a sequential manner, because at that point the construction time will be amortized. Indeed, this will bring you two orders of magnitude improved speed, when compared to calling np.random.choice or random.choices multiple times.

Development

git clone https://github.com/MaxHalford/vose
cd vose
uv sync
uv run pytest

Further work

  • The weights assigned to each integer cannot be modified. A search tree can be used as a data structure that supports modifications. This allows modifying weights in O(log(n)) time, but means sampling also happens in O(log(n)) time. More information here.
  • I'm not 100% the memory allocation of the memoryviews is optimal.
  • Initializing a vose.Sampler from another Cython .pyx file seems to require some Python interaction; there's probably a way to avoid this.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vose-0.2.3.tar.gz (171.4 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

vose-0.2.3-cp313-cp313-win_amd64.whl (248.2 kB view details)

Uploaded CPython 3.13Windows x86-64

vose-0.2.3-cp313-cp313-win32.whl (235.3 kB view details)

Uploaded CPython 3.13Windows x86

vose-0.2.3-cp313-cp313-musllinux_1_2_x86_64.whl (1.8 MB view details)

Uploaded CPython 3.13musllinux: musl 1.2+ x86-64

vose-0.2.3-cp313-cp313-musllinux_1_2_aarch64.whl (1.7 MB view details)

Uploaded CPython 3.13musllinux: musl 1.2+ ARM64

vose-0.2.3-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (755.5 kB view details)

Uploaded CPython 3.13manylinux: glibc 2.24+ x86-64manylinux: glibc 2.28+ x86-64

vose-0.2.3-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (743.8 kB view details)

Uploaded CPython 3.13manylinux: glibc 2.24+ ARM64manylinux: glibc 2.28+ ARM64

vose-0.2.3-cp313-cp313-macosx_10_13_universal2.whl (349.5 kB view details)

Uploaded CPython 3.13macOS 10.13+ universal2 (ARM64, x86-64)

vose-0.2.3-cp312-cp312-win_amd64.whl (248.5 kB view details)

Uploaded CPython 3.12Windows x86-64

vose-0.2.3-cp312-cp312-win32.whl (235.5 kB view details)

Uploaded CPython 3.12Windows x86

vose-0.2.3-cp312-cp312-musllinux_1_2_x86_64.whl (1.8 MB view details)

Uploaded CPython 3.12musllinux: musl 1.2+ x86-64

vose-0.2.3-cp312-cp312-musllinux_1_2_aarch64.whl (1.7 MB view details)

Uploaded CPython 3.12musllinux: musl 1.2+ ARM64

vose-0.2.3-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (756.7 kB view details)

Uploaded CPython 3.12manylinux: glibc 2.24+ x86-64manylinux: glibc 2.28+ x86-64

vose-0.2.3-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (745.2 kB view details)

Uploaded CPython 3.12manylinux: glibc 2.24+ ARM64manylinux: glibc 2.28+ ARM64

vose-0.2.3-cp312-cp312-macosx_10_13_universal2.whl (351.3 kB view details)

Uploaded CPython 3.12macOS 10.13+ universal2 (ARM64, x86-64)

File details

Details for the file vose-0.2.3.tar.gz.

File metadata

  • Download URL: vose-0.2.3.tar.gz
  • Upload date:
  • Size: 171.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vose-0.2.3.tar.gz
Algorithm Hash digest
SHA256 be3b343c7956108e07ed849b87b51f27afc9744fcb02b151e8bf3df78fd6e448
MD5 788fc72be512eaeaa2f0c1d1fc7ccb96
BLAKE2b-256 f4f3058ea855a301099f22c822fdc79c8e2a8ecd674e4285a05207f070f7b4b7

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3.tar.gz:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp313-cp313-win_amd64.whl.

File metadata

  • Download URL: vose-0.2.3-cp313-cp313-win_amd64.whl
  • Upload date:
  • Size: 248.2 kB
  • Tags: CPython 3.13, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vose-0.2.3-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 dc0353999c478fb63e26cbfc3af0822ad810f211b65cc87fe604652e1963f483
MD5 00860c0ee54750214dc884c668e2ef63
BLAKE2b-256 ce9f19316e4ad522a58de637a7e9b7e8042a04d812bc21792b7908e53ed694ad

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp313-cp313-win_amd64.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp313-cp313-win32.whl.

File metadata

  • Download URL: vose-0.2.3-cp313-cp313-win32.whl
  • Upload date:
  • Size: 235.3 kB
  • Tags: CPython 3.13, Windows x86
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vose-0.2.3-cp313-cp313-win32.whl
Algorithm Hash digest
SHA256 f86e4dbd428e9ac9eefe6dc9c2b7c30ecfe540cf6a8df670f93604cd0ee4dd7c
MD5 0dcf6b5506a369592ba540b2c6999a54
BLAKE2b-256 61bc3ce236b5428d5f74f37ee091e5b24c0f41dee631094e3826deacd00b917b

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp313-cp313-win32.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp313-cp313-musllinux_1_2_x86_64.whl.

File metadata

File hashes

Hashes for vose-0.2.3-cp313-cp313-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 3d7d4c964dc80fa8f77c9df96ecc1bde66375afb43bda459a00f5ad0f33f215d
MD5 f84467114d80b51d73bee3ada54beed0
BLAKE2b-256 ac6358cbb6e4534a28654d4b9833d73b1c2f97959c53f479ff2bea5d11d3a556

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp313-cp313-musllinux_1_2_x86_64.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp313-cp313-musllinux_1_2_aarch64.whl.

File metadata

File hashes

Hashes for vose-0.2.3-cp313-cp313-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 98e590737476f661c16c5de3d22f62002f449347b476900bee644cf90ef5597c
MD5 9cf6092cdaec15861c00d0d135b33e04
BLAKE2b-256 5e8afa362ef16ebce818085bcd646d54540c5bd18de59e7eab73c73d6fbf2e41

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp313-cp313-musllinux_1_2_aarch64.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for vose-0.2.3-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 a2c3e4dff28b6464ccb44f543729a80bdef787bdc631939b46b46462cbd5a5ea
MD5 ec5c3f416265450d5211d0e7b00cd4cf
BLAKE2b-256 e1ac371e4e37aeb00daf891a2bf8b48300b27cec29bcc02b65c3673f92cb0f2a

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for vose-0.2.3-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 3e099ff85ecb52eda12d98abfcf959f30676dcfc8be89626dc952454f029e542
MD5 0d3733082b62c0e4dfbb3a01924deccd
BLAKE2b-256 f45bbb98a1a2094e6aea6eb54fde9fb49950b1a69b43cbdfe6d393dc4e47116d

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp313-cp313-macosx_10_13_universal2.whl.

File metadata

File hashes

Hashes for vose-0.2.3-cp313-cp313-macosx_10_13_universal2.whl
Algorithm Hash digest
SHA256 b40075a756f6e18fc97a66c11893139bb52ac5b87d92a2393a74bb8d683075b3
MD5 e9b3e2bf4abd4998690e66f7f0feb25c
BLAKE2b-256 77bba5df030c41b58736e2b4adca66a9f41139d2630efc57e3ee0fb1f5fa5a11

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp313-cp313-macosx_10_13_universal2.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: vose-0.2.3-cp312-cp312-win_amd64.whl
  • Upload date:
  • Size: 248.5 kB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vose-0.2.3-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 b4d7cddbbc0440a5eff4e9d04862f52cd3fe7b321aa380b138ee825e8947c3c2
MD5 d853abdad4b2fb8e56ffc561ee98bb55
BLAKE2b-256 11d13b7752dcbcfc44e55dc6a56a0508bb57cf8024aea011f3f0ca04fdc646ce

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp312-cp312-win_amd64.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp312-cp312-win32.whl.

File metadata

  • Download URL: vose-0.2.3-cp312-cp312-win32.whl
  • Upload date:
  • Size: 235.5 kB
  • Tags: CPython 3.12, Windows x86
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vose-0.2.3-cp312-cp312-win32.whl
Algorithm Hash digest
SHA256 d506ec60f86f34c2fdfd6100268c192038487a482f58ebfebef1679c2d1aa1af
MD5 3ed16d3dfaaf1ec741487e61302a0787
BLAKE2b-256 99dadf6c0f5086f7d159e4993d796c4a583f7905e19eb0df1e027393f2ba07e8

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp312-cp312-win32.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp312-cp312-musllinux_1_2_x86_64.whl.

File metadata

File hashes

Hashes for vose-0.2.3-cp312-cp312-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 6bfc6d0948b53e7964a949124583cb45cf2a354409ed13b66c698dbc82059e05
MD5 cb6b52c4e79b2fe2f3d28b874aacfe36
BLAKE2b-256 223d2e291cad78fa38606884f89e0d80fe86e67ee6cdada38e292af907203699

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp312-cp312-musllinux_1_2_x86_64.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp312-cp312-musllinux_1_2_aarch64.whl.

File metadata

File hashes

Hashes for vose-0.2.3-cp312-cp312-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 d3bbdc438e942065a88c2777b03b02fe0fd9a311f77add3d9eddf9b9071b088a
MD5 6cfe47225b76d71336556435b21ab726
BLAKE2b-256 ab9e75b7754694dc0ccf100e761fae02eca705e1b35b3330676814d0ffb3605f

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp312-cp312-musllinux_1_2_aarch64.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for vose-0.2.3-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 539ecbd5f4f75000243643aadf37811f152dba6a4b38f4560d579cbf5ce38b92
MD5 8510e30d74da430c7c15e68595e16a78
BLAKE2b-256 494b23fa3a70e269f059e113618a30dd99157173084b9a2f1082722c78878262

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for vose-0.2.3-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 33a930a857975704ed953655f59963d0b7bd7ca2ba9c5c56e1a9a0597d206621
MD5 ead052b2757a3fc981b08a7101620ca0
BLAKE2b-256 79ceb9967cb2097a56197b86fd8661631e0cdc8fea05b49d404e19f2b1f107f6

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vose-0.2.3-cp312-cp312-macosx_10_13_universal2.whl.

File metadata

File hashes

Hashes for vose-0.2.3-cp312-cp312-macosx_10_13_universal2.whl
Algorithm Hash digest
SHA256 818f2298ba844e5b75befb82a1ef9dcb388c124aeb64172beb7b4586812ded0c
MD5 2412bd603ba46ff3b525a934bc61864c
BLAKE2b-256 9008c5ea587d6673aa1c316ed77513cb0920ff3548e07a2aca0fa647c34bcc96

See more details on using hashes here.

Provenance

The following attestation bundles were made for vose-0.2.3-cp312-cp312-macosx_10_13_universal2.whl:

Publisher: publish.yml on MaxHalford/vose

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page