
Sumtrees for NumPy arrays

Project description


STArr

Fast sum tree ops in Cython for NumPy arrays. Inspired by Prioritized Experience Replay.

Installation

pip install starr 

Quickstart

Initialize a SumTreeArray, a subclass of numpy.ndarray

>>> from starr import SumTreeArray
>>> sumtree_array = SumTreeArray(4, dtype='float32')
>>> sumtree_array
SumTreeArray([0., 0., 0., 0.], dtype=float32)

Or build one from an existing n-dimensional ndarray

>>> import numpy as np
>>> sumtree_array_2d = SumTreeArray(np.array([[1,2,3],[4,5,6]], dtype='int32'))
>>> sumtree_array_2d
SumTreeArray([[1, 2, 3],
              [4, 5, 6]], dtype=int32)

Set values like you normally would

>>> sumtree_array[0] = 1
>>> sumtree_array[1:2] = [2]
>>> sumtree_array[np.array([False,False,True,False])] = 3
>>> sumtree_array[-1] = 4
>>> sumtree_array
SumTreeArray([1., 2., 3., 4.], dtype=float32)

A SumTreeArray maintains an internal sum tree, which can be used for fast sampling and sum ops.

>>> sumtree_array.sumtree()
array([ 0., 10.,  3.,  7.], dtype=float32)
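Conceptually, sampling walks down this tree from the root: at each node a uniform draw is compared against the left child's mass, and the walk descends left or right accordingly. Below is a minimal sketch of that descent, assuming the heap layout suggested by the sumtree() output above (node 1 holds the total, each node holds the sum of its two children, and the leaves are the array values themselves); descend and leaves are illustrative names, not part of starr, and this is not the library's Cython implementation:

>>> tree = sumtree_array.sumtree()            # [ 0., 10.,  3.,  7.]
>>> def descend(u, tree=tree, leaves=sumtree_array, n=4):
...     i = 1                                 # start at the root (total mass)
...     while i < n:                          # walk down until we reach a leaf
...         left = 2 * i
...         left_mass = tree[left] if left < n else leaves[left - n]
...         if u < left_mass:
...             i = left                      # u falls inside the left subtree
...         else:
...             u -= left_mass                # skip the left mass and go right
...             i = left + 1
...     return i - n                          # leaf position in the original array
>>> [descend(u) for u in (0.5, 2.5, 4.0, 9.0)]
[0, 1, 2, 3]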

Sample indices (efficiently), where each element's value is its unnormalized probability of being sampled

>>> sumtree_array.sample(10)
array([2, 3, 3, 3, 3, 1, 2, 2, 2, 0], dtype=int32)

>>> # probability of being sampled
>>> sumtree_array / sumtree_array.sum() 
array([0.1, 0.2, 0.3, 0.4], dtype=float32)

>>> # sampled proportions
>>> (sumtree_array.sample(100000)[None] == np.arange(4)[:,None]).mean(axis=1) 
array([0.10057, 0.19919, 0.29983, 0.40041])

You can also sample indices from an n-dimensional SumTreeArray

>>> sumtree_array_2d.sample(4)
(array([1, 1, 0, 0]), array([0, 1, 1, 2]))
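The result is a tuple with one index array per axis, so it can be used directly for fancy indexing (which, like the other get operations described under Memory below, should return a plain ndarray). Using the indices sampled above:

>>> rows, cols = np.array([1, 1, 0, 0]), np.array([0, 1, 1, 2])  # the indices from above
>>> sumtree_array_2d[rows, cols]              # values at the sampled positions
array([4, 5, 2, 3], dtype=int32)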

Use the array's sum method to calculate sums quickly via the sum tree

>>> sumtree_array.sum()
10.0
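The full sum is simply the root of the internal tree (index 1 in the sumtree() output above), which is why whole-array sums run in effectively constant time:

>>> sumtree_array.sumtree()[1] == sumtree_array.sum()
True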

Memory

Arithmetic operations return a plain ndarray (to avoid expensive tree initialization)

>>> sumtree_array * 2
array([ 2., 4., 6., 8.], dtype=float32)

This is true for get operations as well

>>> sumtree_array[1:3]
array([2., 3.], dtype=float32)

>>> sumtree_array[:]
array([1., 2., 3., 4.], dtype=float32)

However, in-place operations update the SumTreeArray (and its tree)

>>> sumtree_array_in_place_op = SumTreeArray(np.array([2,4,6,8]),dtype='float32')
>>> sumtree_array_in_place_op += 1
>>> sumtree_array_in_place_op 
SumTreeArray([3., 5., 7., 9.], dtype=float32)
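A quick way to confirm which results keep the sum tree is to inspect the type; this is just a sanity check of the behaviour described above, not extra API:

>>> type(sumtree_array * 2).__name__, type(sumtree_array[1:3]).__name__
('ndarray', 'ndarray')
>>> type(sumtree_array_in_place_op).__name__
'SumTreeArray'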

Performance

See latest benchmarks.

Sampling indices is faster than standard sampling in numpy (e.g. np.random.choice)

>>> x = SumTreeArray(np.ones(int(1e6)))
>>> %timeit x.sample(100)
55.2 µs ± 6.17 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

>>> y = np.ones(int(1e6))
>>> %timeit np.random.choice(len(y),size=100,p=y/y.sum())
10.8 ms ± 697 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
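To reproduce the comparison outside IPython, the timeit module works just as well; a rough sketch (absolute timings will vary by machine):

>>> import timeit
>>> t_tree = timeit.timeit(lambda: x.sample(100), number=1000)      # seconds for 1,000 calls
>>> t_choice = timeit.timeit(lambda: np.random.choice(len(y), size=100, p=y/y.sum()), number=100)
>>> t_tree / 1000 < t_choice / 100            # per-call, sum tree sampling wins
True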

For large arrays, sum operations over C-contiguous blocks of memory are faster than with a plain ndarray, thanks to the sum tree:

>>> x = SumTreeArray(np.ones((1000,1000)))
>>> %timeit x.sum()
428 ns ± 10.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

>>> y = np.ones((1000,1000))
>>> %timeit y.sum()
272 µs ± 51.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

>>> %timeit x.sum(axis=1)
118 µs ± 2.2 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

>>> %timeit y.sum(axis=1)
276 µs ± 68.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Sum operations over non C-contiguous blocks of memory (e.g. along the first axis of a 2d array) are slower:

>>> %timeit x.sum(axis=0)
367 µs ± 28 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

>>> %timeit y.sum(axis=0)
303 µs ± 6.97 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Set operations are much slower on a SumTreeArray than on an ndarray, because each set operation also updates the tree. That trade-off pays off for applications that rely heavily on sampling and sum operations, such as prioritized experience replay. In the example below, updating and sampling with SumTreeArray is 150x faster than with ndarray, even though the update operation alone is 26x faster in ndarray.

>>> x = SumTreeArray(np.ones(int(1e6)))

>>> # set + sample 
>>> %timeit x[-10:] = 2; x.sample(100)
71.4 µs ± 3.71 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

>>> y = np.ones(int(1e6))
>>> y_sum = y.sum() # let's assume we keep track of this efficiently

>>> # set + sample 
>>> %timeit y[-10:] = 2; np.random.choice(len(y),size=100,p=y/y_sum)
10.7 ms ± 752 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
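This set-then-sample loop is exactly the access pattern of a prioritized replay buffer. A rough sketch of how a SumTreeArray could back one (the ReplayBuffer class and its methods are illustrative, not part of starr):

>>> class ReplayBuffer:
...     def __init__(self, capacity):
...         self.priorities = SumTreeArray(capacity, dtype='float32')
...         self.data = [None] * capacity
...         self.pos = 0
...     def add(self, item, priority):
...         self.data[self.pos] = item
...         self.priorities[self.pos] = priority          # updates the tree in place
...         self.pos = (self.pos + 1) % len(self.data)
...     def sample(self, k):
...         idx = self.priorities.sample(k)               # prioritized draw of k indices
...         return idx, [self.data[i] for i in idx]
>>> buffer = ReplayBuffer(8)
>>> for i in range(8):
...     buffer.add(('transition', i), priority=i + 1.0)
>>> idx, items = buffer.sample(4)
>>> len(items)
4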
