A fast Count-Min Sketch data structure.

Project description

The Count–min sketch (or CM sketch) is a probabilistic, sub-linear-space streaming algorithm that summarizes a data stream and answers approximate queries over it, most commonly how often a given item has occurred. The algorithm was invented in 2003 by Graham Cormode and S. Muthu Muthukrishnan.

Count–min sketches are somewhat similar to Bloom filters; the main distinction is that Bloom filters represent sets, while CM sketches represent multisets and frequency tables. Spectral Bloom filters with the multi-set policy are conceptually isomorphic to the Count-Min Sketch.
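
To make the multiset idea concrete, here is a minimal pure-Python sketch of the general Count-Min technique. It is not this package's code: the class name `TinyCountMin`, its `add` and `estimate` methods, and the salted BLAKE2b hashing are all assumptions chosen for illustration. Each key increments one counter per row, and a query returns the minimum of those counters, so an estimate can be too high (because of collisions) but never too low.

```python
import hashlib
from collections import Counter


class TinyCountMin:
    """Illustrative Count-Min sketch: `depth` rows of `width` counters each."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One hash per row, derived by salting BLAKE2b with the row number
        # (an illustrative choice, not the package's hash function).
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        # Every row gets the increment, each at its own hashed column.
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += count

    def estimate(self, key):
        # Collisions can only inflate a counter, so the minimum across
        # rows is the tightest available estimate.
        return min(
            self.rows[row][self._index(row, key)] for row in range(self.depth)
        )


stream = ["a", "b", "a", "c", "a", "b"]
sketch = TinyCountMin(width=16, depth=4)
exact = Counter()
for item in stream:
    sketch.add(item)
    exact[item] += 1
```

Comparing `sketch.estimate(k)` with `exact[k]` for each key shows the one-sided error: the estimate is always at least the true count.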

This particular implementation is optimized for speed: it uses NumPy and the fnv64 hash function, and it makes use of as much of each hash as possible.
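
One way to read "making use of as much of each hash as possible" is to cut a single 64-bit hash into several bit slices and use each slice as a column index for a different row, so that fewer hash computations are needed per key. The sketch below follows that reading. It is an interpretation, not the package's actual code: the FNV-1a variant, the helper names `fnv1a_64` and `row_indices`, and the seeding scheme are assumptions.

```python
import numpy as np

FNV_OFFSET_64 = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV_PRIME_64 = 0x100000001B3        # standard 64-bit FNV prime
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a hash (assumed variant; the source only says fnv64)."""
    h = FNV_OFFSET_64
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_64) & MASK_64
    return h


def row_indices(key: str, depth: int, width: int) -> np.ndarray:
    """Return one column index per row, reusing the bits of each hash."""
    bits = max(1, (width - 1).bit_length())  # bits needed for one index
    per_hash = max(1, 64 // bits)            # indices available per hash
    indices = []
    seed = 0
    while len(indices) < depth:
        # Hypothetical seeding: prefix the key with a counter to get a
        # fresh hash once the previous one is used up.
        h = fnv1a_64(f"{seed}:{key}".encode("utf-8"))
        for _ in range(per_hash):
            if len(indices) == depth:
                break
            # The modulo is slightly biased when width is not a power of two.
            indices.append((h & ((1 << bits) - 1)) % width)
            h >>= bits
        seed += 1
    return np.array(indices, dtype=np.intp)


depth, width = 4, 1024
table = np.zeros((depth, width), dtype=np.int64)
rows = np.arange(depth)


def add(key, count=1):
    # Fancy indexing updates one cell per row in a single NumPy operation.
    table[rows, row_indices(key, depth, width)] += count


def estimate(key):
    return int(table[rows, row_indices(key, depth, width)].min())


add("foo", 5)
```

With `width=1024`, each index needs 10 bits, so one 64-bit hash covers up to six rows and a depth of 4 costs a single hash per key.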

Example Usage

>>> from count_min_sketch import CountMinSketch  # import path assumed from the package name
>>> cms = CountMinSketch(200, 500)
>>> cms['foo']
0
>>> cms['foo'] += 5
>>> cms['foo']
5
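
In Python, `cms['foo'] += 5` is evaluated as a `__getitem__` call followed by a `__setitem__` call with the new total. The sketch below shows one way such an interface could be wired onto a counter table. It is a hypothetical illustration, not the package's implementation: the class name `ItemSketch`, the keyword arguments, and the use of the built-in `hash()` are assumptions.

```python
class ItemSketch:
    """Hypothetical wrapper mapping item syntax onto a Count-Min table."""

    def __init__(self, width, depth):
        self.width = width
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, key):
        # Built-in hash() salted with the row number stands in for a
        # per-row hash. It is only stable within one interpreter run.
        return [
            (row, hash((i, key)) % self.width)
            for i, row in enumerate(self.table)
        ]

    def __getitem__(self, key):
        # Query: minimum counter across all rows.
        return min(row[col] for row, col in self._cells(key))

    def __setitem__(self, key, value):
        # `sketch[key] += n` arrives here as value = estimate + n,
        # so only the difference is added to each row's counter.
        delta = value - self[key]
        for row, col in self._cells(key):
            row[col] += delta


sketch = ItemSketch(width=500, depth=4)
sketch["foo"] += 5
sketch["foo"] += 2
print(sketch["foo"])  # prints 7
```

Because the assigned value is interpreted relative to the current estimate, plain assignment such as `sketch["foo"] = 10` in this sketch raises the counters by the difference rather than overwriting them.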

Download files

Source Distribution

count_min_sketch-1.0.1.tar.gz (3.6 kB view details)

Uploaded Source

Built Distribution

count_min_sketch-1.0.1-py2.7.egg (7.8 kB view details)

Uploaded Source

Hashes for count_min_sketch-1.0.1.tar.gz
Algorithm    Hash digest
SHA256       91bd9b4f72f66f549ae4062492baf7276761c2566473dec812a35206d62fd731
MD5          b144723797a7c45521308246cf3443cd
BLAKE2b-256  fecd06ea6fde884b310ab8715595b1318a1613b4fcb238b841c40b4c47950eb0

Hashes for count_min_sketch-1.0.1-py2.7.egg
Algorithm    Hash digest
SHA256       afe7fcc59bc516326a87a873c4fce60512f144575d5dca7e065e2a333e9bef9d
MD5          7b24db1ee8f544385f0b6b63519603b4
BLAKE2b-256  fc6809fcda6fab5ef1da8e1235a153aff53d4372196539f75a3c86967ce6a7db