Skip to main content

Implemenation of shifting bloom filter in python 3

Project description

Shifting Bloom filter implementation in python3.

Brief description

Bloom filter is a probabilistic test data structure that returns False if an object is not part of a set. When bloom filter returns True an object might or might not be part of a set. Shifting Bloom filter is an extension on Bloom fliter that allows for multiple set or multiset usecases.

Library description

This library is made up of 4 submodules, namely the ShiftingBloomFilter which contains the Shifting Bloom filter it self, utils which contains a set of utilities that are useful while using the set, exceptions which contain the exceptions associated with the bloom filter and visualiser which contains a graphic tool that can be used to inspect the filter.

API description

ShiftingBloomFilter

The filter supports following built-in methods: len(), bool() (True if not empty), str(), repr(), obj[index]

name type arguments description
MULTISET constant N/A constant value for initialising the filter to be used with multiset
MULTIPLE constant N/A constant value for initialising the filter to be used with multiple sets
ShiftingBloomFilter(length) class length, hash_count, hash_source, mode, set_count bloom filter with support for handling multisets or multiplesets
length the size of the underlying bytearrray which is used to represent the filter
hash_count=len(algorithms_guaranteed) amount of hashing functions to use. NOTE!: cannot be greater than the length of hash source
hash_source=algorithms_guaranteed a list of hashing functions to use.
length_as_power=True is length of filter expressed as power of 2 (True) or is it literal (False)
mode=MULTIPLE MULTIPLE if there are multiple sets or MULTISET if its one set but supporting multiple elements
set_count=0 how many sets is this filter suppoused to support
obj.insert(item) method item, set_no=0 insert item into the filter with set_no (applicable for multiple sets only)
obj.check(item) method item check if item is in the filter
obj.save2file() method filename=sbf.bin save filter to file (binary)
obj.get_fpr() method get false postitive rate for current state of the filter
ShiftingBloomFilter.load_from_file() static method filename=sbf.bin load filter from binary file

utils

name type arguments description
CSVDataSet(filename) class filename, separator=',' Iterative reader for csv data sets
built-ins repr(), next()
RandomStringGenerator() class string_length=4, ascii_start=32, ascii_end=126, stream_length=... a stream of random strings of given length
built-ins repr(), len(), next()
HashFunction(hash_base, salt) class hash_base, salt wrapper around salted hashing function
built-ins repr(), obj()
HashFactory(hash_family, hash_count) class hash_family, hash_count Produces a list of salted hash functions. hash_family is a base hash function from hashlib. hash_count is number of hash functions to create
obj.save2file() method filename=hashdata.bin save HashFactory object to file.
HashFactory.load_from_file() static method filename=hashdata.bin load HashFactory object from file
built-ins len(), repr(), next(), obj[index]

exceptions

name type arguments description
SBFException() Exception class Top-level module excpetion
HashesUnavailableError(message) Exception class message Exception raised when there is error related to hashing functions

ShiftingBloomFilter.visualiser

Tool for presenting how shifting bloom filter works. Can be used as debugger if needed.

name type arguments description
Main() class title=ShiftingBloomFilter Visualiser, length=25, hash_count=None, hash_source=None, bloom=None, deepcopy=True, mode=MULTIPLE, no_sets=1 Main window of Visualiser.
title title for the visualiser window
length length of the filter
hash_count number of hashing functions to use
hash_source source of hash functions
bloom use existing bloom filter instead of creating new one (useful for debugging)
deepcopy work on copy of a given bloom filter
Main.run() static method title=ShiftingBloomFilter Visualiser, length=25, mode=MULTIPLE Starts the visualiser

Project details


Release history Release notifications | RSS feed

This version

1.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

shiftingbloomfilter-1.0.tar.gz (21.2 kB view details)

Uploaded Source

Built Distribution

ShiftingBloomFilter-1.0-py3-none-any.whl (21.7 kB view details)

Uploaded Python 3

File details

Details for the file shiftingbloomfilter-1.0.tar.gz.

File metadata

  • Download URL: shiftingbloomfilter-1.0.tar.gz
  • Upload date:
  • Size: 21.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.9.6

File hashes

Hashes for shiftingbloomfilter-1.0.tar.gz
Algorithm Hash digest
SHA256 af4a0f3f086eb23a0c5864c1f863e994a9f47f9589ba9e727d67a181c74b5740
MD5 5118b7d92e20b93a61ed18ff34bc2550
BLAKE2b-256 d93d764f4ab8394a623bbf4089a2cf5d103f406747e87534ef9c10cf957da922

See more details on using hashes here.

File details

Details for the file ShiftingBloomFilter-1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for ShiftingBloomFilter-1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 69b0428d885fb955bee3d8779c33fdfb023b6fb1db39f6b507a046a1e2f3b3b5
MD5 4ced717d2ea82206633528355af42fda
BLAKE2b-256 6004f822926ecdc874d81fc947f123126c0f62350a17f64dc3bfccd901fda712

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page