Python library which implements a Redis-backed Bloom filter.
ibloom
This is a fork of pyreBloom-ng, a Python library that implements a Redis-backed Bloom filter.
pyreBloom-ng is really powerful, but its setup.py, tests, and bench/benchmark.py are all outdated, and the repo's last commit was 4 years ago.
ibloom builds on pyreBloom-ng and adds support for Python 3's str, so you can avoid the annoying b'some_key'.
Installation
prerequisites
ibloom requires the hiredis library, Cython, and a C compiler.
hiredis
# Mac
brew install hiredis
# Ubuntu
sudo apt-get install libhiredis-dev
# From source:
git clone https://github.com/redis/hiredis
cd hiredis && make && sudo make install
Cython
pip install Cython
Startup
init an instance
from ibloom import IBloom
ib = IBloom('ibloomI', 1000, 0.01, '127.0.0.1', 6383)
or
from ibloom import IBloom
ib_n = IBloom(key='ibloomN', capacity=1000, error=0.01, host='127.0.0.1', port=6383)
check basic info
# You can find out how many bits this will theoretically consume
>>> ib.bits
9585
# And how many hashes are needed to satisfy the false positive rate
>>> ib.hashes
7
# find all available bloom filter keys
>>> ib.keys()
['ibloomI.0']
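The bits and hashes values shown above match the textbook Bloom filter sizing formulas. The following is a minimal sketch of that arithmetic, not ibloom's own code: the function name bloom_parameters is hypothetical, and the rounding (truncating the bit count, rounding the hash count up) is an assumption chosen because it reproduces 9585 and 7 for a capacity of 1000 and an error rate of 0.01.

```python
import math


def bloom_parameters(capacity, error):
    """Return (bits, hashes) from the standard Bloom filter sizing formulas."""
    # m = -n * ln(p) / (ln 2)^2, truncated to whole bits (assumed rounding)
    bits = int(-capacity * math.log(error) / (math.log(2) ** 2))
    # k = (m / n) * ln 2, rounded up (assumed rounding)
    hashes = math.ceil(math.log(2) * bits / capacity)
    return bits, hashes


bits, hashes = bloom_parameters(1000, 0.01)
print(bits, hashes)  # 9585 7
```

Lowering the error rate or raising the capacity increases both numbers, which is why it helps to check them before creating a large filter.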
add data
add all supplied values
# Add one value at a time (slow)
>>> ib.add('first')
True
# Or use batch operations (faster).
>>> ib.update([f'{x}' for x in range(5)])
5
# Alternative: ib += data (this form returns nothing)
>>> ib += [f'{x + 5}' for x in range(5)]
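When the input is too large to build as a single list, one option is to feed update() in fixed-size chunks. This is a sketch under assumptions: update_in_chunks and FakeBloom are hypothetical names, the helper only relies on an update(list) method that returns an integer as in the example above, and FakeBloom is a stand-in so the snippet runs without a Redis server.

```python
from itertools import islice


def update_in_chunks(bloom, values, chunk_size=1000):
    """Call bloom.update() on successive lists of at most chunk_size values."""
    iterator = iter(values)
    total = 0
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            break
        # Sum whatever integer each update() call returns
        total += bloom.update(chunk)
    return total


class FakeBloom:
    """Stand-in with an update() method, used only to exercise the helper."""

    def __init__(self):
        self.calls = []

    def update(self, values):
        self.calls.append(list(values))
        return len(values)


fake = FakeBloom()
total = update_in_chunks(fake, (str(x) for x in range(5)), chunk_size=2)
print(total, fake.calls)
```

With a real filter you would pass the IBloom instance instead of FakeBloom. The best chunk size depends on your data and network, so treat 1000 as a placeholder.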
only add values that do not exist yet
# First finds the values not yet in the filter, then adds them to Redis and returns them
>>> ib.update_difference(['5', '6', '7', '8', '9', '10'])
['10']
check if key exists
find one
# Test one value at a time (slow).
# Using the `in` operator
>>> 'first' in ib
True
# Using the contains() method
>>> ib.contains('first')
True
find multiple
# Use batch operations (faster).
# Note: ibloom.intersection() returns a list of values
# which are found in a Bloom filter. It makes sense when
# you consider it a set-like operation.
>>> ib.intersection(['3', '4', '5', '6'])
['3', '4', '5', '6']
# Alternative: ib & ['3', '4', '5', '6']
>>> ib & ['3', '4', '5', '6', '9', '10']
['3', '4', '5', '6', '9']
find values that do not exist
>>> ib.difference(['5', '6', '7', '8', '9', '10'])
['10']
# Alternative: the ^ operator (not recommended, its behavior may change in the future)
>>> ib ^ ['5', '6', '7', '8', '9', '10']
['10']
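To see why intersection() and difference() behave like set operations, here is a small in-memory Bloom filter. It is an illustration only and not how ibloom is implemented: ibloom keeps its bits in Redis, while the TinyBloom class, its SHA-256 based hashing, and the bytearray storage below are all assumptions made for this sketch.

```python
import hashlib


class TinyBloom:
    """In-memory Bloom filter for illustration only (not ibloom's implementation)."""

    def __init__(self, bits, hashes):
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray((bits + 7) // 8)

    def _positions(self, value):
        # Derive k bit positions from two parts of one SHA-256 digest (double hashing)
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.bits for i in range(self.hashes)]

    def add(self, value):
        for pos in self._positions(value):
            self.array[pos // 8] |= 1 << (pos % 8)

    def contains(self, value):
        return all(self.array[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(value))

    def intersection(self, values):
        # Values the filter reports as present
        return [v for v in values if self.contains(v)]

    def difference(self, values):
        # Values the filter reports as absent
        return [v for v in values if not self.contains(v)]


tb = TinyBloom(bits=9585, hashes=7)
for x in range(10):
    tb.add(str(x))
print(tb.intersection(["3", "4", "5", "6"]))  # ['3', '4', '5', '6']
print(tb.difference(["5", "6", "7"]))         # []
```

A Bloom filter never misses a value that was added, so added values always show up in intersection() and never in difference(). A value that was never added can still be reported as present, with a probability close to the configured error rate.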
delete the bloom key
# delete self
ib.delete()