Probabilistic data structures

Probabilistic Structures

Probstructs is easy to use Python wrapper for C++ library probstructs . It supports Exponential Histograms, Count Min Sketch (CM-Sketch), and Exponential Count Min Sketch (ECM-Sketch).

Installation

With pip:

pip install probstructs

From source:

pip install .

Classes

CountMinSketch

Count–min sketch (CM sketch) is a probabilistic data structure that serves as a frequency table of events in a stream of data. It uses hash functions to map events to frequencies, but unlike a hash table uses only sub-linear space, at the expense of overcounting some events due to collisions.

from probstructs import CountMinSketch

cm_sketch = CountMinSketch(100, 4)
cm_sketch.inc("aaa", 1)
cm_sketch.inc("bbb", 5)
cm_sketch.inc("aaa", 2)

print(cm_sketch.get("aaa"))
# 3
print(cm_sketch.get("bbb"))
# 5
print(cm_sketch.get("ccc"))
# 0

cm_sketch = CountMinSketch(width=100, depth=4)
cm_sketch.inc(key="bbb", delta=5)
print(cm_sketch.get(key="bbb"))
# 5

ExponentialHistorgram

Exponential histogram (EH) is a probabilistic data structure that serves as a frequency counter for specific elements in the last N elements from stream..

from probstructs import ExponentialHistorgram

eh = ExponentialHistorgram(1)
eh.inc(1, 1)
print(eh.get(1, 1))
# 1
eh.inc(1, 1)
print(eh.get(1, 1))
# 2
eh.inc(2, 1)
print(eh.get(1, 2))
# 1

eh = ExponentialHistorgram(window=1)
eh.inc(tick=1, delta=1)
print(eh.get(window=1, tick=1))
# 1
eh.inc(tick=1, delta=1)
print(eh.get(window=1, tick=1))
# 2
eh.inc(tick=2, delta=1)
print(eh.get(window=1, tick=2))
# 1

ExponentialCountMinSketch

Exponential count-min sketch (ECM-Sketch) combines CM-Sketch with EH to count number of different elements in the last N elements in the stream.

from probstructs import ExponentialCountMinSketch

ecm_sketch = ExponentialCountMinSketch(100, 4, 8)

ts = 0
ecm_sketch.inc("aaa", ts, 1)
ecm_sketch.inc("bbb", ts, 4)
ecm_sketch.inc("ccc", ts, 8)

print(ecm_sketch.get("aaa", 4, ts))
# 1
print(ecm_sketch.get("bbb", 4, ts))
# 4
print(ecm_sketch.get("ccc", 4, ts))
# 8
print(ecm_sketch.get("ddd", 4, ts))
# 0

ecm_sketch = ExponentialCountMinSketch(width=100, depth=4, window=8)

ts = 0
ecm_sketch.inc(key="aaa", tick=ts, delta=1)
ecm_sketch.inc(key="bbb", tick=ts, delta=4)
ecm_sketch.inc(key="ccc", tick=ts, delta=8)

print(ecm_sketch.get(key="aaa", window=4, tick=ts))
# 1
print(ecm_sketch.get(key="bbb", window=4, tick=ts))
# 4
print(ecm_sketch.get(key="ccc", window=4, tick=ts))
# 8
print(ecm_sketch.get(key="ddd", window=4, tick=ts))
# 0

Changelog

0.2.0

• Introduce named parameters

• Update documentation to contain examples

0.1.0

• Initial version

Project details

Source Distribution

probstructs-0.2.6.tar.gz (7.2 kB view hashes)

Uploaded source

Built Distributions

probstructs-0.2.6-pp37-pypy37_pp73-win32.whl (53.6 kB view hashes)

Uploaded pp37

probstructs-0.2.6-pp37-pypy37_pp73-manylinux2010_x86_64.whl (537.4 kB view hashes)

Uploaded pp37

probstructs-0.2.6-pp37-pypy37_pp73-manylinux1_x86_64.whl (537.4 kB view hashes)

Uploaded pp37

probstructs-0.2.6-pp37-pypy37_pp73-macosx_10_9_x86_64.whl (68.3 kB view hashes)

Uploaded pp37

probstructs-0.2.6-pp36-pypy36_pp73-win32.whl (53.6 kB view hashes)

Uploaded pp36

probstructs-0.2.6-pp36-pypy36_pp73-manylinux2010_x86_64.whl (537.3 kB view hashes)

Uploaded pp36

probstructs-0.2.6-pp36-pypy36_pp73-manylinux1_x86_64.whl (537.3 kB view hashes)

Uploaded pp36

probstructs-0.2.6-pp36-pypy36_pp73-macosx_10_9_x86_64.whl (68.2 kB view hashes)

Uploaded pp36

probstructs-0.2.6-pp27-pypy_73-win32.whl (54.7 kB view hashes)

Uploaded pp27

probstructs-0.2.6-pp27-pypy_73-manylinux2010_x86_64.whl (537.5 kB view hashes)

Uploaded pp27

probstructs-0.2.6-pp27-pypy_73-manylinux1_x86_64.whl (537.5 kB view hashes)

Uploaded pp27

probstructs-0.2.6-pp27-pypy_73-macosx_10_9_x86_64.whl (68.6 kB view hashes)

Uploaded pp27

probstructs-0.2.6-cp39-cp39-win_amd64.whl (60.7 kB view hashes)

Uploaded cp39

probstructs-0.2.6-cp39-cp39-win32.whl (54.4 kB view hashes)

Uploaded cp39

probstructs-0.2.6-cp39-cp39-manylinux2010_x86_64.whl (1.3 MB view hashes)

Uploaded cp39

probstructs-0.2.6-cp39-cp39-manylinux2010_i686.whl (1.4 MB view hashes)

Uploaded cp39

probstructs-0.2.6-cp39-cp39-manylinux1_x86_64.whl (1.3 MB view hashes)

Uploaded cp39

probstructs-0.2.6-cp39-cp39-manylinux1_i686.whl (1.4 MB view hashes)

Uploaded cp39

probstructs-0.2.6-cp39-cp39-macosx_10_9_x86_64.whl (76.4 kB view hashes)

Uploaded cp39

probstructs-0.2.6-cp38-cp38-win_amd64.whl (60.6 kB view hashes)

Uploaded cp38

probstructs-0.2.6-cp38-cp38-win32.whl (54.3 kB view hashes)

Uploaded cp38

probstructs-0.2.6-cp38-cp38-manylinux2010_x86_64.whl (1.3 MB view hashes)

Uploaded cp38

probstructs-0.2.6-cp38-cp38-manylinux2010_i686.whl (1.4 MB view hashes)

Uploaded cp38

probstructs-0.2.6-cp38-cp38-manylinux1_x86_64.whl (1.3 MB view hashes)

Uploaded cp38

probstructs-0.2.6-cp38-cp38-manylinux1_i686.whl (1.4 MB view hashes)

Uploaded cp38

probstructs-0.2.6-cp38-cp38-macosx_10_9_x86_64.whl (76.2 kB view hashes)

Uploaded cp38

probstructs-0.2.6-cp37-cp37m-win_amd64.whl (61.3 kB view hashes)

Uploaded cp37

probstructs-0.2.6-cp37-cp37m-win32.whl (55.1 kB view hashes)

Uploaded cp37

probstructs-0.2.6-cp37-cp37m-manylinux2010_x86_64.whl (1.4 MB view hashes)

Uploaded cp37

probstructs-0.2.6-cp37-cp37m-manylinux2010_i686.whl (1.4 MB view hashes)

Uploaded cp37

probstructs-0.2.6-cp37-cp37m-manylinux1_x86_64.whl (1.4 MB view hashes)

Uploaded cp37

probstructs-0.2.6-cp37-cp37m-manylinux1_i686.whl (1.4 MB view hashes)

Uploaded cp37

probstructs-0.2.6-cp37-cp37m-macosx_10_9_x86_64.whl (75.4 kB view hashes)

Uploaded cp37

probstructs-0.2.6-cp36-cp36m-win_amd64.whl (61.3 kB view hashes)

Uploaded cp36

probstructs-0.2.6-cp36-cp36m-win32.whl (55.1 kB view hashes)

Uploaded cp36

probstructs-0.2.6-cp36-cp36m-manylinux2010_x86_64.whl (1.4 MB view hashes)

Uploaded cp36

probstructs-0.2.6-cp36-cp36m-manylinux2010_i686.whl (1.4 MB view hashes)

Uploaded cp36

probstructs-0.2.6-cp36-cp36m-manylinux1_x86_64.whl (1.4 MB view hashes)

Uploaded cp36

probstructs-0.2.6-cp36-cp36m-manylinux1_i686.whl (1.4 MB view hashes)

Uploaded cp36

probstructs-0.2.6-cp36-cp36m-macosx_10_9_x86_64.whl (75.4 kB view hashes)

Uploaded cp36

probstructs-0.2.6-cp35-cp35m-win_amd64.whl (61.3 kB view hashes)

Uploaded cp35

probstructs-0.2.6-cp35-cp35m-win32.whl (55.1 kB view hashes)

Uploaded cp35

probstructs-0.2.6-cp35-cp35m-manylinux2010_x86_64.whl (1.4 MB view hashes)

Uploaded cp35

probstructs-0.2.6-cp35-cp35m-manylinux2010_i686.whl (1.4 MB view hashes)

Uploaded cp35

probstructs-0.2.6-cp35-cp35m-manylinux1_x86_64.whl (1.4 MB view hashes)

Uploaded cp35

probstructs-0.2.6-cp35-cp35m-manylinux1_i686.whl (1.4 MB view hashes)

Uploaded cp35

probstructs-0.2.6-cp35-cp35m-macosx_10_9_x86_64.whl (75.4 kB view hashes)

Uploaded cp35

probstructs-0.2.6-cp27-cp27mu-manylinux2010_x86_64.whl (1.4 MB view hashes)

Uploaded cp27

probstructs-0.2.6-cp27-cp27mu-manylinux2010_i686.whl (1.4 MB view hashes)

Uploaded cp27

probstructs-0.2.6-cp27-cp27mu-manylinux1_x86_64.whl (1.4 MB view hashes)

Uploaded cp27

probstructs-0.2.6-cp27-cp27mu-manylinux1_i686.whl (1.4 MB view hashes)

Uploaded cp27

probstructs-0.2.6-cp27-cp27m-manylinux2010_x86_64.whl (1.4 MB view hashes)

Uploaded cp27

probstructs-0.2.6-cp27-cp27m-manylinux2010_i686.whl (1.4 MB view hashes)

Uploaded cp27

probstructs-0.2.6-cp27-cp27m-manylinux1_x86_64.whl (1.4 MB view hashes)

Uploaded cp27

probstructs-0.2.6-cp27-cp27m-manylinux1_i686.whl (1.4 MB view hashes)

Uploaded cp27

probstructs-0.2.6-cp27-cp27m-macosx_10_9_x86_64.whl (76.0 kB view hashes)

Uploaded cp27