rBloom

Ultralightweight, blazing fast, minimalistic Bloom filter library for Python, implemented in Rust.

Usage

This library defines only one class, the signature of which should be thought of as:

class Bloom:

    def __init__(self, size_in_bits):
        ...

    def __contains__(self, object):
        ...

    def add(self, object):
        ...

See examples.
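
To make the interface concrete, here is a minimal pure-Python stand-in with the same three methods. It is only a sketch and not rBloom's Rust implementation: the name ToyBloom, the use of Python's built-in hash(), and the choice of setting a single bit per object are all assumptions made for illustration.

```python
class ToyBloom:
    """Hypothetical pure-Python model of the interface, for illustration only."""

    def __init__(self, size_in_bits):
        if size_in_bits <= 0:
            raise ValueError("size_in_bits must be positive")
        self.size_in_bits = size_in_bits
        # One bit per slot, packed eight to a byte.
        self._bits = bytearray((size_in_bits + 7) // 8)

    def _index(self, obj):
        # Assumption: one hash per object, reduced modulo the filter size.
        return hash(obj) % self.size_in_bits

    def __contains__(self, obj):
        i = self._index(obj)
        return bool(self._bits[i // 8] & (1 << (i % 8)))

    def add(self, obj):
        i = self._index(obj)
        self._bits[i // 8] |= 1 << (i % 8)
```

As with any Bloom filter, a membership test on this model can return a false positive when two objects map to the same bit, but never a false negative.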

The size in bits equals the theoretical maximum number of objects the filter could contain. In practice, the filter should be significantly larger than this to reduce the likelihood of birthday collisions, which cause the __contains__ method to return a false positive True. To choose a size, divide the maximum number of expected items by the maximum acceptable likelihood of a false positive (e.g. 200 items / 0.01 likelihood = 20000 bits).
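
The sizing rule above can be sketched as a small helper. Both function names are hypothetical and not part of rBloom. The estimate in the second function assumes each object sets exactly one uniformly chosen bit, which is an assumption about the filter rather than a documented property.

```python
import math


def size_in_bits_for(max_items, max_false_positive_rate):
    """Apply the rule of thumb: expected items / acceptable false positive rate."""
    if max_items < 0:
        raise ValueError("max_items must be non-negative")
    if not 0 < max_false_positive_rate <= 1:
        raise ValueError("max_false_positive_rate must be in (0, 1]")
    return math.ceil(max_items / max_false_positive_rate)


def estimated_false_positive_rate(items, size_in_bits):
    """Chance that a given bit is set after `items` insertions.

    Assumes one uniformly random bit per inserted object.
    """
    return 1 - (1 - 1 / size_in_bits) ** items


bits = size_in_bits_for(200, 0.01)
print(bits, estimated_false_positive_rate(200, bits))
```

Under that assumption, the estimate stays just below the requested rate, which is consistent with the rule of thumb being a slightly conservative approximation.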

Building

Use maturin to build this library. At the time of writing, this can be done with:

$ pip install maturin
$ maturin build --release

This creates a wheel, which can be found in target/wheels.

Examples

A minimal example:

from rbloom import Bloom

filter = Bloom(200)

assert "hello" not in filter

filter.add("hello")

assert "hello" in filter

Print the first 1000 squares, along with roughly 0.1% (a fraction of 0.001) of the numbers in between, which show up as false positives:

from rbloom import Bloom

filter = Bloom(int(1000 / 0.001))

for i in range(1000):
    filter.add(i*i)

for i in range(1000**2):
    if i in filter:
        print(i, end=" ")

Download files

Source Distribution

rbloom-0.1.0.tar.gz (7.7 kB)

Built Distribution

rbloom-0.1.0-cp311-cp311-manylinux_2_34_x86_64.whl (198.7 kB; CPython 3.11, manylinux: glibc 2.34+, x86-64)