A fast and memory-efficient Bloom Filter implementation with memory mapping support
Project description
A simple and fast Pythonic Bloom filter.
From wikipedia: "A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not – in other words, a query returns either "possibly in set" or "definitely not in set". Elements can be added to the set, but not removed (though this can be addressed with a "counting" filter); the more elements that are added to the set, the larger the probability of false positives."
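To make the definition above concrete, here is a minimal pure-Python sketch of a Bloom filter. It is not the fastbloomfilter implementation: the class name `TinyBloom`, its parameters, and the use of salted SHA-256 to derive bit positions are all assumptions chosen for illustration.

```python
import hashlib


class TinyBloom:
    """Illustrative Bloom filter; NOT the fastbloomfilter implementation."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index
        # (an assumption made for this sketch).
        data = item.encode("utf-8")
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def query(self, item):
        # True means "possibly in set"; False means "definitely not in set".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


tiny = TinyBloom()
tiny.add("30000")
print(tiny.query("30000"))  # True: an added item is always reported as present
```

Because bits are only ever set and never cleared, an added item can never be reported as absent. An item that was never added may still be reported as present if all of its positions were set by other items.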
This filter supports:
- Saving and reloading with pickle
- Stats
- Entropy analysis
- Internal and external hashing of data
- Raw filter merging
Installing:
sudo pip install fastbloomfilter
External creation of the bloom filter file:
python mkbloom.py /tmp/filter.blf
Importing:
>>> from fastBloomFilter import bloom
>>> bf = bloom.BloomFilter(array_size=1024**3)
Or, using a filter file:
>>> from fastBloomFilter import bloom
>>> bf = bloom.BloomFilter(filename='/tmp/filter.blf')
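When choosing a value for `array_size`, the standard Bloom filter formulas can serve as a rough guide. The helper below is a sketch and not part of fastbloomfilter: `bloom_parameters` is a hypothetical name, and the source does not state whether `array_size` is measured in bits or bytes, nor how many hash functions the library uses, so treat the result as an estimate only.

```python
import math


def bloom_parameters(expected_items, false_positive_rate):
    """Return (num_bits, num_hashes) from the textbook Bloom filter formulas.

    num_bits   = -n * ln(p) / (ln 2)^2
    num_hashes = (num_bits / n) * ln 2
    """
    if expected_items <= 0:
        raise ValueError("expected_items must be positive")
    if not 0.0 < false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    num_bits = math.ceil(
        -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)
    )
    num_hashes = max(1, round(num_bits / expected_items * math.log(2)))
    return num_bits, num_hashes


# Roughly 9.6 million bits and 7 hash functions for one million items at 1%.
num_bits, num_hashes = bloom_parameters(1_000_000, 0.01)
print(num_bits, num_hashes)
```

The formulas assume independent, uniformly distributed hash functions, so real-world false positive rates may differ somewhat.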
Adding data to it:
>>> bf.add('30000')
>>> bf.add('1230213')
>>> bf.add('1')
Printing stats:
>>> bf.stat()
Or:
>>> bf.info()
Querying data:
>>> print(bf.query('1'))
True
>>> print(bf.query('1230213'))
True
>>> print(bf.query('12'))
False
>>> print(bf['1'])
True
Querying data and at the same time adding it:
>>> print(bf.update('1'))
False
# False means the object was not already present and has now been added.
>>> print(bf.update('1'))
True
# True means the object existed and nothing new was added.
>>> print(bf.update('2'))
False
>>> print(bf.update('2'))
True
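The test-and-add behaviour of `update` lends itself to de-duplicating a stream in a single pass. The sketch below relies only on the return convention shown above (True if already present, False if newly added). `SetBackedFilter` and `first_occurrences` are hypothetical names, and the stand-in class uses a plain `set` so the example runs without the library installed.

```python
class SetBackedFilter:
    """Hypothetical stand-in that mimics the update() return convention."""

    def __init__(self):
        self._seen = set()

    def update(self, item):
        # Return True if the item was already present;
        # otherwise add it and return False.
        if item in self._seen:
            return True
        self._seen.add(item)
        return False


def first_occurrences(items, seen_filter):
    """Yield each item the first time the filter reports it as new."""
    for item in items:
        if not seen_filter.update(item):
            yield item


print(list(first_occurrences(["1", "2", "1", "3", "2"], SetBackedFilter())))
# ['1', '2', '3']
```

With a real Bloom filter in place of the stand-in, a false positive would cause a genuinely new item to be skipped, so this pattern suits cases where occasionally dropping an item is acceptable.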
Merging two filters:
Create first filter:
>>> from fastBloomFilter import bloom
>>> bf1 = bloom.BloomFilter(array_size=1024**3)
>>> bf1.add("1")
Create second filter:
>>> from fastBloomFilter import bloom
>>> bf2 = bloom.BloomFilter(array_size=1024**3)
>>> bf2.add("2")
Merge the two filters into a third filter:
>>> bf3 = bf1 + bf2
Check the elements in the third filter:
>>> print(bf3["1"])
True
>>> print(bf3["2"])
True
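Conceptually, two Bloom filters of the same size that use the same hash functions can be merged by taking the bitwise OR of their bit arrays. The sketch below illustrates that idea on plain `bytearray` objects. It is an assumption about what merging involves, not a description of how fastbloomfilter implements `+`, and `merge_bit_arrays` is a hypothetical helper.

```python
def merge_bit_arrays(bits_a, bits_b):
    """Return the bitwise OR of two equally sized bit arrays."""
    if len(bits_a) != len(bits_b):
        raise ValueError("filters must have the same size to be merged")
    return bytearray(a | b for a, b in zip(bits_a, bits_b))


left = bytearray([0b00000101, 0b00000000])
right = bytearray([0b00000010, 0b10000000])
merged = merge_bit_arrays(left, right)
print([bin(byte) for byte in merged])  # ['0b111', '0b10000000']
```

Every bit set in either input is set in the result, so any item that tests as present in one of the original filters also tests as present in the merged one.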
Contributing
Contributions are welcome!
Criteria:
- They should not include hidden folders or files from any IDE.
- They should not delete large portions of the project.
- They should not include files that have nothing to do with the project.
- They should not change the API. (API changes should be proposed as enhancement issues.)
- They should not include any obfuscated code.
- They should not include binaries.
- They should be submitted as small PRs for faster review.
- They should include a small test case.
- Any contribution that does not honor these criteria will be rejected until it does.
File details
Details for the file fastbloomfilter-0.0.13.tar.gz.
File metadata
- Download URL: fastbloomfilter-0.0.13.tar.gz
- Upload date:
- Size: 19.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5d548c915ea5e8ce4bbe29445d123aef3562703b46529e4150dbd09bc4b21f05 |
| MD5 | d7bf1ce5ab8f1806a410db2b6efb9222 |
| BLAKE2b-256 | ed695ec865a3c6b679f139ae14dfc21f18a95bcf7df1a60f5754fcfe88b6800a |
Provenance
The following attestation bundles were made for fastbloomfilter-0.0.13.tar.gz:
Publisher: pypi-publish.yml on daedalus/fastBloomFilter
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fastbloomfilter-0.0.13.tar.gz
- Subject digest: 5d548c915ea5e8ce4bbe29445d123aef3562703b46529e4150dbd09bc4b21f05
- Sigstore transparency entry: 1178857828
- Sigstore integration time:
- Permalink: daedalus/fastBloomFilter@4a346598b646e43cb3258e9f44134f4cc5274966
- Branch / Tag: refs/tags/v0.0.13
- Owner: https://github.com/daedalus
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@4a346598b646e43cb3258e9f44134f4cc5274966
- Trigger Event: release
File details
Details for the file fastbloomfilter-0.0.13-py3-none-any.whl.
File metadata
- Download URL: fastbloomfilter-0.0.13-py3-none-any.whl
- Upload date:
- Size: 21.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 943997559608722ad6bce825c6990210368c6c28ccc955c39495919a747c9e96 |
| MD5 | b317560742c2b41d75fa2dcb6d13ffdb |
| BLAKE2b-256 | 1c8b61d1ffb53d2e8de7229293acfc6a2a1b4c69a1ecf3a762fff0141af00108 |
Provenance
The following attestation bundles were made for fastbloomfilter-0.0.13-py3-none-any.whl:
Publisher: pypi-publish.yml on daedalus/fastBloomFilter
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: fastbloomfilter-0.0.13-py3-none-any.whl
- Subject digest: 943997559608722ad6bce825c6990210368c6c28ccc955c39495919a747c9e96
- Sigstore transparency entry: 1178857831
- Sigstore integration time:
- Permalink: daedalus/fastBloomFilter@4a346598b646e43cb3258e9f44134f4cc5274966
- Branch / Tag: refs/tags/v0.0.13
- Owner: https://github.com/daedalus
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@4a346598b646e43cb3258e9f44134f4cc5274966
- Trigger Event: release