Fast, memory-efficient Bloom Filters backed by Redis / Valkey bitmaps.
Project description
ValBloom
Fast, memory-efficient Bloom Filters backed by Redis / Valkey bitmaps.
ValBloom provides three async-first probabilistic data structures for membership testing — ideal for token blacklisting, duplicate detection, rate limiting, and more.
Installation
pip install valbloom
Requires Python ≥ 3.9 and a running Redis or Valkey instance.
Quick Start
from redis.asyncio import Redis
from valbloom import BloomFilter
r = Redis.from_url("redis://localhost")
# Create a filter expecting up to 100k items with 0.1% false-positive rate
bf = BloomFilter(r, "bf:emails", capacity=100_000, error_rate=0.001)
await bf.add("alice@example.com")
await bf.exists("alice@example.com") # True
await bf.exists("bob@example.com") # False (probably)
Filter Types
BloomFilter — Standard
The classic space-efficient probabilistic set. Supports add and membership testing but not deletion.
bf = BloomFilter(client, "bf:tokens", capacity=100_000)
await bf.add("token-abc")
await bf.add_many(["token-1", "token-2", "token-3"])
await bf.exists("token-abc") # True
await bf.exists_many(["token-1", "nope"]) # {"token-1": True, "nope": False}
# Set operations (requires compatible filters — same capacity & error_rate)
merged = await bf1.union(bf2, dest_key="bf:merged")
common = await bf1.intersection(bf2, dest_key="bf:common")
CountingBloomFilter — Supports Deletion
Uses counter buckets (Redis hash) instead of single bits, allowing items to be removed.
from valbloom import CountingBloomFilter
cbf = CountingBloomFilter(client, "cbf:sessions", capacity=50_000)
await cbf.add("session-xyz")
await cbf.exists("session-xyz") # True
await cbf.remove("session-xyz")
await cbf.exists("session-xyz") # False
ScalableBloomFilter — Auto-Growing
Automatically creates new filter slices when the current one fills up, so the false-positive rate stays bounded even as data grows beyond the initial capacity.
from valbloom import ScalableBloomFilter
sbf = ScalableBloomFilter(
client, "sbf:users",
initial_capacity=1_000,
growth_factor=2, # each new slice doubles in capacity
ratio=0.9, # each slice tightens the error rate
)
# Add millions of items — ValBloom handles the scaling
for i in range(50_000):
await sbf.add(f"user-{i}")
Common API
All three filter types share these methods:
| Method | Description |
|---|---|
add(item) |
Add a single item |
exists(item) → bool |
Check membership (probabilistic) |
add_many(items) |
Bulk add |
exists_many(items) → dict |
Bulk check |
count → int |
Number of items added |
info() → dict |
Filter stats (size, fill ratio, etc.) |
clear() |
Reset the filter |
set_ttl(seconds) |
Set expiration on Redis keys |
get_ttl() → int |
Get remaining TTL |
CountingBloomFilter additionally supports:
| Method | Description |
|---|---|
remove(item) |
Remove a single item |
remove_many(items) |
Bulk remove |
BloomFilter additionally supports:
| Method | Description |
|---|---|
union(other, dest_key) |
Merge filters with BITOP OR |
intersection(other, dest_key) |
Intersect filters with BITOP AND |
How It Works
A Bloom filter uses a bit array of m bits and k independent hash functions. When adding an item, all k hash positions are set to 1. To check membership, all k positions are tested — if any is 0, the item is definitely not in the set.
ValBloom automatically computes optimal m and k from your desired capacity and error_rate:
- Bit-array size:
m = -(n × ln p) / (ln 2)² - Hash count:
k = (m / n) × ln 2
| Capacity | Error Rate | Memory | Hashes |
|---|---|---|---|
| 10,000 | 1% | ~12 KB | 7 |
| 100,000 | 0.1% | ~180 KB | 10 |
| 1,000,000 | 0.01% | ~2.3 MB | 13 |
When to Use Which
| Use case | Filter |
|---|---|
| Token blacklisting, email dedup, IP blocklisting | BloomFilter |
| Session tracking, temporary bans (need removal) | CountingBloomFilter |
| Unbounded growth, don't know final size | ScalableBloomFilter |
Development
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests (uses fakeredis, no real Redis needed)
pytest tests/ -v
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file valbloom-0.1.0.tar.gz.
File metadata
- Download URL: valbloom-0.1.0.tar.gz
- Upload date:
- Size: 18.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
95d95148a3d0d0cde83790636d5debcba7a1eab12414e1ac135b085c3aa06156
|
|
| MD5 |
c92174144e25129bace69ff43a459cf9
|
|
| BLAKE2b-256 |
1c182b944b87d805d418d282d125a1d915ad947e95fdb91742c613906e7c33c8
|
File details
Details for the file valbloom-0.1.0-py3-none-any.whl.
File metadata
- Download URL: valbloom-0.1.0-py3-none-any.whl
- Upload date:
- Size: 13.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
213594e5271b98cb16d566324d531f28feb66eabe7aa07b9e160715dee3dba09
|
|
| MD5 |
0b15dbba82e0b538d731faa839dba5cb
|
|
| BLAKE2b-256 |
97eaa89d1eb3e86e1e7ed02108a37c9753362b0a2b340eddabf1532612db54e9
|