Simple Bloom Filter implementation in Python

Project description

Bloom Filter

Implemented in Python 3.

  • The price of a Bloom filter's efficiency is that it is probabilistic: it can return false positives. A false positive means the filter reports that a given username is already taken when it actually is not.
  • It never returns false negatives: it will not report that a username is absent when it is actually present. If an element exists, the filter reports it as "maybe present"; if the filter reports an element as absent, that answer is 100% certain.
  • Elements cannot be deleted from the filter. Clearing the bits at the indices generated by the k hash functions for one element may also clear bits shared with other elements, which effectively deletes them too.
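
The three properties above can be seen in a small standalone sketch. It is not the bloomf source: the class `TinyBloom`, its `naive_remove` method, and the helper `find_overlapping_pair` are hypothetical names, and salted SHA-256 from the standard library stands in for the k hash functions so that the example runs without extra packages.

```python
import hashlib


class TinyBloom:
    """Illustrative Bloom filter (hypothetical; not the bloomf source)."""

    def __init__(self, size, hash_count):
        self.size = size              # number of bits
        self.hash_count = hash_count  # k, the number of hash functions
        self.bits = [False] * size

    def indices(self, item):
        # Assumption for this sketch: k hashes are simulated by salting SHA-256.
        result = []
        for seed in range(self.hash_count):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            result.append(int.from_bytes(digest[:8], "big") % self.size)
        return result

    def add(self, item):
        for i in self.indices(item):
            self.bits[i] = True

    def check(self, item):
        # True means "maybe present"; False means "definitely absent".
        return all(self.bits[i] for i in self.indices(item))

    def naive_remove(self, item):
        # Deliberately unsafe: clears bits that other items may rely on.
        for i in self.indices(item):
            self.bits[i] = False


def find_overlapping_pair(bloom):
    """Return two names sharing a bit; size + 1 names guarantee one exists."""
    owner = {}
    for n in range(bloom.size + 1):
        name = f"user{n}"
        for i in bloom.indices(name):
            if owner.get(i, name) != name:
                return owner[i], name
            owner[i] = name
    raise AssertionError("unreachable by the pigeonhole principle")


bloom = TinyBloom(size=16, hash_count=3)
first, second = find_overlapping_pair(bloom)
bloom.add(first)
bloom.add(second)
added_ok = bloom.check(first) and bloom.check(second)  # no false negatives

bloom.naive_remove(first)
second_lost = not bloom.check(second)  # an added item now looks absent

print(added_ok, second_lost)  # True True
```

Once `first` is cleared, `second` is reported as absent even though it was added. That is exactly the false negative a Bloom filter must never produce, and it is why deletion is not supported.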

Installation

pip install bloomf==0.2

Distributed as a PyPI package.

Usage

You can use this Bloom filter as follows:

from bloomf import BloomFilter

n = 10  # expected number of items to be added
p = 0.04  # desired false positive probability

# Named `bloom` rather than `filter` to avoid shadowing the built-in filter()
bloom = BloomFilter(n, p)

print("Size of bit array: {}".format(bloom.size))
print("False positive probability: {}".format(bloom.fp_prob))
print("Number of hash functions: {}".format(bloom.hash_count))

word_present = ['abound', 'abounds', 'abundance', 'abundant', 'accessable', 'bloom', 'blossom', 'bolster', 'bonny', 'bonus', 'bonuses']
word_absent = ['bluff', 'cheater', 'hate', 'war', 'humanity', 'racism', 'hurt', 'facebook', 'sambhav', 'twitter']

# Add every word that should be reported as present
for word in word_present:
    bloom.add(word)

# Test a mix of added words and words that were never added
test_words = word_present[:5] + word_absent

for word in test_words:
    if bloom.check(word):
        if word in word_absent:
            print("'{}' is a false positive!".format(word))
        else:
            print("'{}' is probably present!".format(word))
    else:
        print("'{}' is 100% not present!".format(word))
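
The bit-array size and hash count printed above are derived from n and p. The sketch below uses the standard Bloom filter sizing formulas, m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2. Whether bloomf uses exactly these formulas, or the same rounding, is an assumption, and the function names `optimal_size` and `optimal_hash_count` are hypothetical.

```python
import math


def optimal_size(n, p):
    """Bits needed for n items at false positive probability p (rounded up)."""
    return math.ceil(-n * math.log(p) / (math.log(2) ** 2))


def optimal_hash_count(m, n):
    """Number of hash functions for m bits and n items (rounded to nearest)."""
    return max(1, round((m / n) * math.log(2)))


n, p = 10, 0.04
m = optimal_size(n, p)
k = optimal_hash_count(m, n)
print(m, k)  # 67 5
```

An implementation that truncates instead of rounding would report slightly smaller values for the same inputs, so the numbers printed by the package may differ by one.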

Dependencies

  • bitarray
  • mmh3


