Skip to main content

Python library which implements a Redis-backed Bloom filter.

Project description

ibloom

this is a fork of pyreBloom-ng, pyreBloom-ng is a python library which implements a Redis-backed Bloom filter.

pyreBloom-ng is really powerful. but it's setup.py and tests and bench/benchmark.py are all outdated, the repo's last commit is 4 years ago.

based on pyreBloom-ng and added supported for python3's str, avoid of annoying b'some_key'

Installation

pre-requirement

ibloom requires hiredis library, Cython and a C compiler

hiredis

# Mac
brew install hiredis

# ubuntu
apt-get install libhiredis-dev

# From source:
git clone https://github.com/redis/hiredis
cd hiredis && make && sudo make install

Cython

pip install Cython

Startup

init an instance

from ibloom import IBloom
ib = IBloom('ibloomI', 1000, 0.01, '127.0.0.1', 6383)

or

from ibloom import IBloom
ib_n = IBloom(key='ibloomN', capacity=1000, error=0.01, host='127.0.0.1', port=6383)

check basic info

# You can find out how many bits this will theoretically consume
>>> ib.bits
9585
# And how many hashes are needed to satisfy the false positive rate
>>> ib.hashes
7
# find all available bloom filter keys
>>> ib.keys()
['ibloomI.0']

add data

add all supplied

# Add one value at a time (slow)
>>> ib.add('first')
True
# Or use batch operations (faster).
>>> ib.update([f'{x}' for x in range(5)])
5
# Alternative: ib += data, but this will return nothing
>>> ib += [f'{x + 5}' for x in range(5)]

only add if not exist

# will first get the difference, and then update them to redis, and return them
>>> ib.update_difference(['5', '6', '7', '8', '9', '10'])
['10']

check if key exists

find one

# Test one value at a time (slow).
# . in ...
>>> 'first' in ib
True
# ...contains(.)
>>> ib.contains('first')
True

find multiple

# Use batch operations (faster).
# Note: ibloom.intersection() returns a list of values
# which are found in a Bloom filter. It makes sense when
# you consider it a set-like operation.
>>> ib.intersection(['3', '4', '5', '6'])
['3', '4', '5', '6']
# Alternative: ib & [b'3', b'4', b'5', b'6']
>>> ib & ['3', '4', '5', '6', '9', '10']
['3', '4', '5', '6', '9']

find non exist

>>> ib.difference(['5', '6', '7', '8', '9', '10'])
['10']
# not recommended, maybe update in the future
# Alternative: ib ^ ['5', '6', '7', '8', '9', '10']
>>> ib ^ ['5', '6', '7', '8', '9', '10']
['10']

delete the bloom key

# delete self
ib.delete()

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ibloom-0.0.2.1.tar.gz (12.6 kB view hashes)

Uploaded Source

Built Distribution

ibloom-0.0.2.1-cp38-cp38-macosx_10_15_x86_64.whl (35.3 kB view hashes)

Uploaded CPython 3.8 macOS 10.15+ x86-64

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page