Skip to main content

Cuckoo Filter implementation in Python

Project description

https://img.shields.io/pypi/v/cuckoopy.svg https://img.shields.io/pypi/l/cuckoopy.svg https://img.shields.io/pypi/wheel/cuckoopy.svg https://img.shields.io/pypi/pyversions/cuckoopy.svg https://travis-ci.org/rajathagasthya/cuckoopy.svg?branch=master

Cuckoo Filter, like Bloom Filter, is a probabilistic data structure for fast, approximate set membership queries, with some small false positive probability. While Bloom Filters are space efficient and are widely used, they do not support deletion of items from the set without rebuilding the entire filter. This can be overcome with several extensions to Bloom Filters such as Counting Bloom Filters, but with significant space overhead.

Cuckoo Filters support adding and removing items dynamically while achieving higher performance than Bloom filters. A Cuckoo Filter is based on partial-key cuckoo hashing that stores only fingerprint of each item inserted. Cuckoo Filters provide higher lookup performance than Bloom Filters and uses less space than Bloom Filters if the target false positive rate is < 3%.

The original research paper Cuckoo Filter: Practically Better Than Bloom by Bin Fan, David G. Andersen, Michael Kaminsky and Michael D. Mitzenmacher describes the data structure in more detail.

Installation

Make sure you have Python (3.5+) installed on your system. If you don’t have it, follow these instructions to install it.

Install cuckoopy using:

$ pip install cuckoopy

Usage

>>> from cuckoopy import CuckooFilter
# Initialize a cuckoo filter with 10000 buckets with bucket size 4 and fingerprint size of 1 byte
>>> cf = CuckooFilter(capacity=10000, bucket_size=4, fingerprint_size=1)

Insert an item into the filter:

>>> cf.insert('Hello!')
True

Lookup an item in the filter:

>>> cf.contains('Hello!')
True
>>> 'Hello!' in cf
True

Delete an item from the filter:

>>> cf.delete('Hello!')
True

Get the size (number of items present) of the filter:

>>> cf.size
4
>>> len(cf)
4

Running tests locally

This project uses pytest for tests. Make sure you have tox installed on your local machine and from the root directory of the project, run:

$ tox

This command runs unit tests in python 3.5 and python 3.6 environments with code coverage details. It also runs pep8 (flake8) checks. To run tox against a specific environment (py35, py36 or pep8), use the -e option.

License

MIT License

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cuckoopy-0.1.1.tar.gz (5.6 kB view details)

Uploaded Source

Built Distribution

cuckoopy-0.1.1-py3-none-any.whl (7.9 kB view details)

Uploaded Python 3

File details

Details for the file cuckoopy-0.1.1.tar.gz.

File metadata

  • Download URL: cuckoopy-0.1.1.tar.gz
  • Upload date:
  • Size: 5.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for cuckoopy-0.1.1.tar.gz
Algorithm Hash digest
SHA256 f89e64ce38414f98c7e032a7590bffd540e422798270f839889e0e479018a8b2
MD5 cad3533bd02d7c64d85c01b11f936479
BLAKE2b-256 8e4165cdc4812f7771c0ee3618ea3f676cadab78261f1eef132f3af4a930f35b

See more details on using hashes here.

File details

Details for the file cuckoopy-0.1.1-py3-none-any.whl.

File metadata

File hashes

Hashes for cuckoopy-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 7c9a13420753d357df7ced76e8ecc11da1d2e91ede7a546f4ecc6d3de2b50746
MD5 b945723575f7017b15c6cd9941c04a5d
BLAKE2b-256 37444ce1f48668c48808e5458d16f3e771146c3d32dfb62d8b5d4a355dd2a9e4

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page