A fast count-min sketch (CMS) streaming data structure implemented in Rust
count-min-sketch
A simple count-min sketch (CMS) data structure in Rust. A CMS can be created with explicit parameters, from a tolerance and a probability, or by copying an existing CMS so that the two can later be combined.
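To illustrate how a tolerance and a probability can determine the size of a sketch, here is a minimal example based on the textbook count-min sketch bounds (width = ceil(e / tolerance), depth = ceil(ln(1 / probability))). Whether this package uses exactly these formulas is an assumption, and `cms_dimensions` is a hypothetical helper that is not part of the package.

```python
import math
from typing import Tuple


def cms_dimensions(tolerance: float, probability: float) -> Tuple[int, int]:
    """Return (width, depth) from the textbook count-min sketch bounds.

    Hypothetical helper for illustration only; the package may size its
    tables differently.
    """
    if not (0 < tolerance < 1) or not (0 < probability < 1):
        raise ValueError("tolerance and probability must lie in (0, 1)")
    # Wider rows reduce the overestimate caused by hash collisions.
    width = math.ceil(math.e / tolerance)
    # More rows reduce the chance that every row collides at once.
    depth = math.ceil(math.log(1.0 / probability))
    return width, depth


width, depth = cms_dimensions(0.1, 0.001)
print(width, depth)
```

Under these bounds, a smaller tolerance grows the table linearly in width, while a smaller probability grows it only logarithmically in depth.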
Each CMS provides methods to insert and retrieve counts, to combine with another CMS, and to multiply all entries by a constant.
The Python interface exposes the new_with_probs constructor, which builds a CMS with the methods insert, retrieve, clear, and scale. As the example below shows, insert returns the item's updated count. A Python CMS can store any hashable object, that is, any object that implements __hash__().
>>> import count_min_sketch
>>> cms1 = count_min_sketch.CMS(0.1,0.001,10000)
>>> cms1.insert("test")
1
>>> cms1.insert(54)
1
>>> cms1.insert(54)
2
>>> cms1.retrieve(54)
2
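The behaviour of insert, retrieve, scale, and clear can be sketched in pure Python. This is only an illustration of the general count-min sketch technique, not the package's Rust implementation: the class name `ToyCMS`, its constructor arguments, and its use of Python's built-in hash() per row are all assumptions made for this example.

```python
from typing import Hashable, List


class ToyCMS:
    """Illustrative count-min sketch (hypothetical, not the package's API)."""

    def __init__(self, width: int, depth: int) -> None:
        self.width = width
        self.depth = depth
        # One row of counters per hash function.
        self.table: List[List[float]] = [[0] * width for _ in range(depth)]

    def _column(self, row: int, item: Hashable) -> int:
        # Mixing the row number into the hash gives each row its own function.
        return hash((row, item)) % self.width

    def insert(self, item: Hashable) -> float:
        # Increment one counter in every row, then report the new estimate.
        for row in range(self.depth):
            self.table[row][self._column(row, item)] += 1
        return self.retrieve(item)

    def retrieve(self, item: Hashable) -> float:
        # The smallest counter is the one least inflated by collisions.
        return min(
            self.table[row][self._column(row, item)]
            for row in range(self.depth)
        )

    def scale(self, factor: float) -> None:
        # Multiply every counter by a constant.
        for row in self.table:
            for i in range(self.width):
                row[i] *= factor

    def clear(self) -> None:
        # Reset every counter to zero.
        for row in self.table:
            for i in range(self.width):
                row[i] = 0


cms = ToyCMS(width=28, depth=7)
cms.insert("test")
cms.insert(54)
cms.insert(54)
# With non-negative counters, an estimate never falls below the true count.
assert cms.retrieve(54) >= 2
```

Because unrelated items can share counters, a retrieved count may be higher than the true count; taking the minimum across rows keeps that overestimate small.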
The basis hash set of a CMS can be cloned, which allows the clone to be combined with the existing CMS. This is useful for maintaining multiple sets of counts and periodically combining them. The combine method throws an error if the two sketches did not originate as clones.
>>> cms2 = count_min_sketch.clone_cms(cms1)
>>> cms2.combine(cms1)
>>> cms2.retrieve("test")
1
>>> cms2.retrieve(54)
2
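The reason two sketches must share a hash basis before they can be combined can be shown with a small pure-Python model. Everything here is an assumption made for illustration: the class `MergeableSketch`, its `new` and `clone_basis` methods, and the per-row salts standing in for the hash basis are hypothetical and do not describe the package's internals.

```python
import random
from typing import Hashable, List, Sequence


class MergeableSketch:
    """Illustrative sketch whose hash basis is a list of per-row salts."""

    def __init__(self, width: int, salts: Sequence[int]) -> None:
        self.width = width
        self.salts: List[int] = list(salts)
        self.rows: List[List[int]] = [[0] * width for _ in self.salts]

    @classmethod
    def new(cls, width: int, depth: int, seed: int = 0) -> "MergeableSketch":
        # Draw one random salt per row; the salts define the hash basis.
        rng = random.Random(seed)
        return cls(width, [rng.getrandbits(64) for _ in range(depth)])

    def clone_basis(self) -> "MergeableSketch":
        # Same width and salts, but every counter starts at zero.
        return MergeableSketch(self.width, self.salts)

    def _column(self, salt: int, item: Hashable) -> int:
        return hash((salt, item)) % self.width

    def insert(self, item: Hashable) -> None:
        for row, salt in zip(self.rows, self.salts):
            row[self._column(salt, item)] += 1

    def retrieve(self, item: Hashable) -> int:
        return min(
            row[self._column(salt, item)]
            for row, salt in zip(self.rows, self.salts)
        )

    def combine(self, other: "MergeableSketch") -> None:
        # Counters only line up if both sketches hash items identically.
        if self.width != other.width or self.salts != other.salts:
            raise ValueError("sketches do not share a hash basis")
        for mine, theirs in zip(self.rows, other.rows):
            for i, value in enumerate(theirs):
                mine[i] += value


total = MergeableSketch.new(width=64, depth=4, seed=1)
worker = total.clone_basis()   # a separate set of counts, same basis
worker.insert("test")
worker.insert(54)
worker.insert(54)
total.combine(worker)          # fold the worker's counts into the total
```

With a shared basis, combining is a cell-by-cell addition, so the merged sketch gives the same estimates as a single sketch that had seen every insert.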
Hashes for count_min_sketch_rs-1.0.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 43468c1a92518825f9ada972e5060b0278a16c95a54bddd9f10a3f103e9714b7
MD5 | 2677d5d041c79c4f29d9d38096e2c6aa
BLAKE2b-256 | ff75541345d3a02fbf89dab72165c7acea7d25e9b6d4121a7e7ccafea2bdb68e
Hashes for count_min_sketch_rs-1.0.0-cp39-cp39-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | adb2dea4b01b69de2a50758b4bd3e716ca46e2e2040baae0922cc4dc034204b8
MD5 | 70443a37ef5e2bd8bae0b5e7c5de105a
BLAKE2b-256 | 702af2b9f7bb5b3d4a24916dc54cce7fff56dab47eae6a83b1331216679b54c0