Skip to main content

A fast Count-Min Sketch data structure.

Project description

The Count–min sketch (or CM sketch) is a probabilistic sub-linear space streaming algorithm which can be used to summarize a data stream in many different ways. The algorithm was invented in 2003 by Graham Cormode and S. Muthu Muthukrishnan.

Count–min sketches are somewhat similar to Bloom filters; the main distinction is that Bloom filters represent sets, while CM sketches represent multisets and frequency tables. Spectral Bloom filters with multi-set policy, are conceptually isomorphic to the Count-Min Sketch.

This particular implementation has been optimized for speed by utilizing numpy, using the fnv64 hash function, and making use of as much of each hash as possible.

Example Usage

>>> from count_min_sketch import CountMinSketch
>>>
>>> cms = CountMinSketch(500, 500)
>>> cms['foo']
0
>>> cms['foo'] += 42
>>> cms
42

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

count_min_sketch-1.0.0.tar.gz (2.6 kB view details)

Uploaded Source

Built Distribution

count_min_sketch-1.0.0-py2.7.egg (5.6 kB view details)

Uploaded Source

File details

Details for the file count_min_sketch-1.0.0.tar.gz.

File metadata

File hashes

Hashes for count_min_sketch-1.0.0.tar.gz
Algorithm Hash digest
SHA256 d829886b8392abe81525fdf91575d271afcea8e6c0b157d63538d18f6cdf82fd
MD5 90efdad0d9389372966695c57bae3f38
BLAKE2b-256 05c8ea8af1c83fc6feb4dbdcb4975406973701f04bf8f2f617ae966500adcb3b

See more details on using hashes here.

File details

Details for the file count_min_sketch-1.0.0-py2.7.egg.

File metadata

File hashes

Hashes for count_min_sketch-1.0.0-py2.7.egg
Algorithm Hash digest
SHA256 3a35232fb7b9093f4087907237ed3150563fcdd2915e3e5d136d0941bb04a724
MD5 62a92e99fb689ec2385e1e52ecff39d0
BLAKE2b-256 0dc5819337c2c8bab09ef381733b64bcfc5428f9199f4bd8954880b75687ad53

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page