Skip to main content

Simple Bloom Filter implmentation in Python

Project description

Bloom Filter

Implemented in Python 3.

  • The price we pay for efficiency through bloom filters is that it is probabilistic in nature that means, there might be some False Positive results. False positive means, it might tell that given username is already taken but actually it’s not.
  • Not being False Negative such that telling that username doesn't exist while it is there, i.e., if exists it reports it's existenece in terms of maybe, else if not present it is 100% confident to report the same.
  • Deleting elements from filter is not possible because, if we delete a single element by clearing bits at indices generated by k hash functions, it might cause deletion of few other elements.

Installation

pip install bloomf==0.2

Distributed as a PyPi Package.

Usage

You can use this bloom filter as follows -

from bloomf import BloomFilter

n = 10  # number of items to be added
p = 0.04  # FP Probablity

filter = BloomFilter(n, p)

print("Size of bit array: {}" . format(filter.size))
print("False positive Probability: {}" . format(filter.fp_prob))
print("Number of hash functions: {}" . format(filter.hash_count))

word_present = ['abound', 'abounds', 'abundance', 'abundant', 'accessable', 'bloom', 'blossom', 'bolster', 'bonny', 'bonus', 'bonuses']
word_absent = ['bluff', 'cheater', 'hate', 'war', 'humanity', 'racism', 'hurt', 'facebook', 'sambhav', 'twitter']

for i in word_present:
    filter.add(i)

test_words = word_present[:5] + word_absent

for word in test_words:
    if filter.check(word):
        if word in word_absent:
            print("'{}' is a false positive!" . format(word))
        else:
            print("'{}' is a probably present!" . format(word))
    else:
        print("'{}' is 100% not present!" . format(word))

Dependencies

  • bitarray
  • mmh3

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bloomf-0.3.tar.gz (2.8 kB view details)

Uploaded Source

Built Distribution

bloomf-0.3-py3-none-any.whl (3.7 kB view details)

Uploaded Python 3

File details

Details for the file bloomf-0.3.tar.gz.

File metadata

  • Download URL: bloomf-0.3.tar.gz
  • Upload date:
  • Size: 2.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for bloomf-0.3.tar.gz
Algorithm Hash digest
SHA256 be3f1a00814e327034c2125da71230a785577280e3c76c977a74df34c0e0c937
MD5 3b8bf48e479969c988051768470a4aaa
BLAKE2b-256 e729be629fb306722944f6f599993a49545c10e77e2073159a1c307e58444b62

See more details on using hashes here.

File details

Details for the file bloomf-0.3-py3-none-any.whl.

File metadata

File hashes

Hashes for bloomf-0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 ef97b6d720241d482dbd8bbf433e7a99f91589d9f5afeee82e35050fca5e3db0
MD5 24425c4f27aab08c6fe7582e0b2437d0
BLAKE2b-256 888f408cf8f074034f91e3f40512e8abb2fb694ae73b886f201487875715b139

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page