Simple Bloom Filter implementation in Python

Project description

Bloom Filter

Implemented in Python 3.

  • The price of a Bloom filter's efficiency is that it is probabilistic: it can return false positives. For example, it might report that a username is already taken when it actually is not.
  • It never returns false negatives: it will not report that a username is absent when it is in fact present. If an element was added, the filter reports it as possibly present; if the filter reports an element as absent, it is definitely absent.
  • Elements cannot be deleted from the filter. Clearing the bits at the indices generated by the k hash functions for one element may also clear bits shared with other elements, which effectively deletes them too.
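
The three properties above can be illustrated with a minimal sketch. This is not the bloomf implementation: the class `TinyBloom`, its `unsafe_remove` method, and the helper `collateral_damage` are hypothetical names, and salted SHA-256 from the standard library stands in for the package's own hashing so that the example has no dependencies.

```python
import hashlib


class TinyBloom:
    """Illustrative Bloom filter (hypothetical; not the bloomf class)."""

    def __init__(self, size, hash_count):
        self.size = size
        self.hash_count = hash_count
        self.bits = [False] * size

    def _indices(self, item):
        # Derive k indices by salting SHA-256 with the hash number.
        for seed in range(self.hash_count):
            digest = hashlib.sha256("{}:{}".format(seed, item).encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for i in self._indices(item):
            self.bits[i] = True

    def check(self, item):
        # True means "maybe present"; False means "definitely absent".
        return all(self.bits[i] for i in self._indices(item))

    def unsafe_remove(self, item):
        # Shown only to demonstrate why deletion is not supported.
        for i in self._indices(item):
            self.bits[i] = False


def collateral_damage(words, size, hash_count):
    """Return (removed, lost) if clearing one word's bits hides another word."""
    for removed in words:
        bloom = TinyBloom(size, hash_count)
        for word in words:
            bloom.add(word)
        bloom.unsafe_remove(removed)
        for other in words:
            if other != removed and not bloom.check(other):
                return removed, other
    return None


words = ["abound", "bloom", "blossom", "bolster", "bonus"]

# No false negatives: every added word is reported as (maybe) present.
bloom = TinyBloom(size=64, hash_count=3)
for word in words:
    bloom.add(word)
print(all(bloom.check(word) for word in words))

# Five words in a 4-bit filter must share at least one bit, so clearing
# one word's bits is guaranteed to make some other word disappear.
print(collateral_damage(words, size=4, hash_count=2))
```

The 4-bit filter in the last step is deliberately tiny so that a shared bit is guaranteed. In a realistically sized filter such overlaps are rarer, but they still occur, which is why a plain Bloom filter offers no delete operation.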

Installation

pip install bloomf==0.2

Distributed as a PyPI package.

Usage

You can use the Bloom filter as follows:

from bloomf import BloomFilter

n = 10  # expected number of items to be added
p = 0.04  # target false-positive probability

# Named `bloom` rather than `filter` to avoid shadowing the built-in filter().
bloom = BloomFilter(n, p)

print("Size of bit array: {}".format(bloom.size))
print("False positive probability: {}".format(bloom.fp_prob))
print("Number of hash functions: {}".format(bloom.hash_count))

word_present = ['abound', 'abounds', 'abundance', 'abundant', 'accessable', 'bloom', 'blossom', 'bolster', 'bonny', 'bonus', 'bonuses']
word_absent = ['bluff', 'cheater', 'hate', 'war', 'humanity', 'racism', 'hurt', 'facebook', 'sambhav', 'twitter']

# Add every word that should be reported as present.
for word in word_present:
    bloom.add(word)

# Check a mix of words that were added and words that were not.
test_words = word_present[:5] + word_absent

for word in test_words:
    if bloom.check(word):
        if word in word_absent:
            print("'{}' is a false positive!".format(word))
        else:
            print("'{}' is probably present!".format(word))
    else:
        print("'{}' is 100% not present!".format(word))
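
The size and hash-count values printed above depend on n and p. A minimal sketch of the standard Bloom filter sizing formulas follows. Whether bloomf computes its values in exactly this way, including the rounding, is an assumption, and the function names are hypothetical.

```python
import math


def optimal_size(n, p):
    """Number of bits m for n items at false-positive rate p: -n ln(p) / (ln 2)^2."""
    return math.ceil(-n * math.log(p) / (math.log(2) ** 2))


def optimal_hash_count(m, n):
    """Number of hash functions k: (m / n) ln 2, rounded, at least 1."""
    return max(1, round((m / n) * math.log(2)))


def expected_fp_rate(m, n, k):
    """Approximate false-positive rate after n insertions: (1 - e^(-kn/m))^k."""
    return (1 - math.exp(-k * n / m)) ** k


n, p = 10, 0.04
m = optimal_size(n, p)
k = optimal_hash_count(m, n)
print("bits:", m, "hash functions:", k)
print("expected false-positive rate:", expected_fp_rate(m, n, k))
```

Adding more items than the n the filter was sized for raises the false-positive rate above the target p, since more bits end up set.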

Dependencies

  • bitarray
  • mmh3
