Skip to main content

Simple Bloom Filter implmentation in Python

Project description

Bloom Filter

Implemented in Python 3.

  • The price we pay for efficiency through bloom filters is that it is probabilistic in nature that means, there might be some False Positive results. False positive means, it might tell that given username is already taken but actually it’s not.
  • Not being False Negative such that telling that username doesn't exist while it is there, i.e., if exists it reports it's existenece in terms of maybe, else if not present it is 100% confident to report the same.
  • Deleting elements from filter is not possible because, if we delete a single element by clearing bits at indices generated by k hash functions, it might cause deletion of few other elements.

Installation

pip install bloomf==0.2

Distributed as a PyPi Package.

Usage

You can use this bloom filter as follows -

from bloomf import BloomFilter

n = 10  # number of items to be added
p = 0.04  # FP Probablity

filter = BloomFilter(n, p)

print("Size of bit array: {}" . format(filter.size))
print("False positive Probability: {}" . format(filter.fp_prob))
print("Number of hash functions: {}" . format(filter.hash_count))

word_present = ['abound', 'abounds', 'abundance', 'abundant', 'accessable', 'bloom', 'blossom', 'bolster', 'bonny', 'bonus', 'bonuses']
word_absent = ['bluff', 'cheater', 'hate', 'war', 'humanity', 'racism', 'hurt', 'facebook', 'sambhav', 'twitter']

for i in word_present:
    filter.add(i)

test_words = word_present[:5] + word_absent

for word in test_words:
    if filter.check(word):
        if word in word_absent:
            print("'{}' is a false positive!" . format(word))
        else:
            print("'{}' is a probably present!" . format(word))
    else:
        print("'{}' is 100% not present!" . format(word))

Dependencies

  • bitarray
  • mmh3

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bloomf-0.3.tar.gz (2.8 kB view hashes)

Uploaded Source

Built Distribution

bloomf-0.3-py3-none-any.whl (3.7 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page