Python library which implements a Redis-backed Bloom filter.

ibloom

ibloom is a fork of pyreBloom-ng, a Python library that implements a Redis-backed Bloom filter.

pyreBloom-ng is powerful, but its setup.py, tests, and bench/benchmark.py are all outdated; the repository's last commit was 4 years ago.

ibloom builds on pyreBloom-ng and adds support for Python 3's str, so keys no longer have to be written as b'some_key'.
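To illustrate what str support means in practice, here is a minimal sketch of the kind of normalization a wrapper could apply before handing a key to the C layer. The helper name `to_bytes` and the UTF-8 default are assumptions made for illustration; they are not taken from ibloom's source.

```python
def to_bytes(value, encoding="utf-8"):
    """Return `value` as bytes: encode str, pass bytes through unchanged.

    Hypothetical helper, not part of ibloom's documented API.
    """
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode(encoding)
    raise TypeError(f"expected str or bytes, got {type(value).__name__}")


print(to_bytes("some_key"))   # b'some_key'
print(to_bytes(b"some_key"))  # b'some_key'
```

With a step like this inside the library, callers can pass 'some_key' directly instead of b'some_key'.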

Installation

Prerequisites

ibloom requires the hiredis library, Cython, and a C compiler.

hiredis

# Mac
brew install hiredis

# Ubuntu
apt-get install libhiredis-dev

# From source:
git clone https://github.com/redis/hiredis
cd hiredis && make && sudo make install

Cython

pip install Cython

Startup

init an instance

from ibloom import IBloom
ib = IBloom('ibloomI', 1000, 0.01, '127.0.0.1', 6383)

or, with keyword arguments:

from ibloom import IBloom
ib_n = IBloom(key='ibloomN', capacity=1000, error=0.01, host='127.0.0.1', port=6383)

check basic info

# You can find out how many bits this will theoretically consume
>>> ib.bits
9585
# And how many hashes are needed to satisfy the false positive rate
>>> ib.hashes
7
# find all available bloom filter keys
>>> ib.keys()
['ibloomI.0']
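The bits and hashes values above are consistent with the standard Bloom filter sizing formulas. The following sketch reproduces them for capacity=1000 and error=0.01. The function name `bloom_parameters` is hypothetical, and the rounding choices (truncate the bit count, round the hash count up) are assumptions chosen to match the output shown, not a statement about ibloom's internals.

```python
import math


def bloom_parameters(capacity, error):
    """Return (bits, hashes) for a Bloom filter holding `capacity` items
    at a target false positive rate of `error`."""
    # Optimal bit count: m = -n * ln(p) / (ln 2)^2, truncated to an integer.
    bits = int(-capacity * math.log(error) / (math.log(2) ** 2))
    # Optimal hash count: k = (m / n) * ln 2, rounded up.
    hashes = math.ceil(math.log(2) * bits / capacity)
    return bits, hashes


print(bloom_parameters(1000, 0.01))  # (9585, 7)
```

Lowering the error rate increases both numbers: more bits are stored in Redis, and each lookup touches more bit positions.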

add data

add all supplied

# Add one value at a time (slow)
>>> ib.add('first')
True
# Or use batch operations (faster).
>>> ib.update([f'{x}' for x in range(5)])
5
# Alternative: ib += data (an in-place update, so nothing is returned)
>>> ib += [f'{x + 5}' for x in range(5)]

only add values that do not exist yet

# First finds the values not yet in the filter, then adds them to Redis and returns them
>>> ib.update_difference(['5', '6', '7', '8', '9', '10'])
['10']
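The behavior of update_difference can be pictured with a plain Python set standing in for the filter. This is only a sketch of the semantics described above, not ibloom's implementation: a set is exact, whereas a Bloom filter can report a false positive and therefore occasionally skip a value that is actually new. How duplicates inside one batch are treated is also an assumption of this sketch.

```python
def update_difference(seen, values):
    """Add the values not already in `seen` and return them in input order.

    `seen` is a plain set used as a stand-in for the Redis-backed filter.
    """
    new_values = []
    for value in values:
        if value not in seen:
            seen.add(value)
            new_values.append(value)
    return new_values


seen = {str(x) for x in range(10)}          # '0' .. '9' already added
print(update_difference(seen, ['5', '6', '7', '8', '9', '10']))  # ['10']
print(update_difference(seen, ['10']))      # []
```

The second call returns an empty list because '10' was added by the first call, which is the property that makes this method useful for deduplicating a stream of batches.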

check if key exists

find one

# Test one value at a time (slow).
# Using the `in` operator
>>> 'first' in ib
True
# Using the contains() method
>>> ib.contains('first')
True

find multiple

# Use batch operations (faster).
# Note: ib.intersection() returns the list of supplied values
# that are found in the Bloom filter. It makes sense when
# you consider it a set-like operation.
>>> ib.intersection(['3', '4', '5', '6'])
['3', '4', '5', '6']
# Alternative: the & operator
>>> ib & ['3', '4', '5', '6', '9', '10']
['3', '4', '5', '6', '9']

find values that do not exist

>>> ib.difference(['5', '6', '7', '8', '9', '10'])
['10']
# Alternative: the ^ operator (not recommended; its behavior may change in a future release)
>>> ib ^ ['5', '6', '7', '8', '9', '10']
['10']
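For readers who want to see how the membership, intersection, and difference operations relate, here is a small in-memory Bloom filter. It is a teaching sketch under stated assumptions: the class name `TinyBloom` is hypothetical, it uses SHA-256 with a seed prefix as the hash family, and it stores bits in a local bytearray instead of Redis. ibloom's actual hashing and storage are not described in this document and may differ.

```python
import hashlib


class TinyBloom:
    """In-memory Bloom filter stand-in (hypothetical, not ibloom's code)."""

    def __init__(self, bits, hashes):
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray((bits + 7) // 8)

    def _positions(self, value):
        # Derive one bit position per hash by prefixing a seed to the value.
        for seed in range(self.hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, value):
        for pos in self._positions(value):
            self.array[pos // 8] |= 1 << (pos % 8)

    def contains(self, value):
        # True only if every bit for this value is set (may be a false positive).
        return all(self.array[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(value))

    def intersection(self, values):
        # Values the filter reports as present.
        return [v for v in values if self.contains(v)]

    def difference(self, values):
        # Values the filter reports as absent (these are definitely new).
        return [v for v in values if not self.contains(v)]


tb = TinyBloom(bits=9585, hashes=7)
for x in range(10):
    tb.add(str(x))

print(tb.intersection(['3', '4', '5', '6']))  # ['3', '4', '5', '6']
print(tb.difference(['5', '6', '7', '8', '9', '10']))
```

Every value that was added is always reported by intersection, because a Bloom filter has no false negatives. A value returned by difference is guaranteed to be absent, while a value that was never added can, with small probability, still show up in intersection.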

delete the bloom key

# Delete this filter's key from Redis
ib.delete()
