An async bloom filter library for Python.

🔮 aiobloom_live

A modern, high-performance, async-native Bloom filter library for Python


aiobloom_live is a powerful Bloom filter library built upon pybloom_live, fully embracing async/await syntax to deliver exceptional performance for high-concurrency I/O scenarios.

Whether you're processing massive data streams, building web crawlers, or in need of an efficient cache-miss detector, aiobloom_live provides a solution that combines an elegant API with ultimate performance.

✨ Core Features

  • 🚀 Blazing-Fast Async Performance: Implements asynchronous file I/O using aiofiles. In the benchmark below, 16 concurrent reads ran about 4× faster than the synchronous equivalent.
  • 🧩 Two Filter Modes:
    • BloomFilter: The classic fixed-size Bloom filter.
    • ScalableBloomFilter: Automatically scales as the number of elements grows, so there is no need to estimate capacity in advance.
  • 🕰️ Backward Compatible: Retains a synchronous API (tofile, fromfile) fully compatible with pybloom_live, ensuring a seamless migration path for existing users.
  • 🔧 Serialization Support: Supports serializing filters to bytes for easy transport over the network or in memory.
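For readers new to the data structure, here is a minimal sketch of the idea behind a fixed-size Bloom filter. It is a toy written for illustration only and is not how aiobloom_live or pybloom_live is implemented; the class name ToyBloomFilter, the salted SHA-256 hashing, and the sizes used are all assumptions.

```python
import hashlib


class ToyBloomFilter:
    """Illustrative fixed-size Bloom filter (not aiobloom_live's implementation)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive one bit position per hash by salting SHA-256 with the index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        # Set every bit the key maps to.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key: str) -> bool:
        # Present only if all mapped bits are set; unrelated keys can
        # collide on every bit, which is where false positives come from.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


toy = ToyBloomFilter(num_bits=16_384, num_hashes=10)
toy.add("hello")
print("hello" in toy)  # an added key is always reported as present
```

Because bits are only ever set, an added key can never be missed; the trade-off is that a key that was never added may occasionally be reported as present.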

📊 Performance Benchmarks

The core advantage of aiobloom_live lies in its asynchronous I/O performance. Below are the results from a benchmark involving 16 concurrent read/write operations on a filter containing 10,000,000 elements with a 0.000001 error rate:

Operation (16 concurrent)   Sync      Async     Performance Boost
File Write                  0.3086s   0.2840s   1.09×
File Read                   2.0815s   0.4776s   4.36×

Conclusion: In concurrent I/O-intensive tasks, the asynchronous model of aiobloom_live is markedly faster for reads (about 4.4×), while writes improve only slightly (1.09×).
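To get a feel for the size of the filter in this benchmark, the textbook Bloom filter formulas can be evaluated directly. This is a rough sketch under the assumption that the standard formulas apply; the library may size its bit array somewhat differently, and the helper name optimal_size is hypothetical.

```python
import math


def optimal_size(capacity: int, error_rate: float) -> tuple[int, int]:
    """Return (bits, hash_count) from the textbook Bloom filter formulas."""
    # m = -n * ln(p) / (ln 2)^2
    bits = math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
    # k = (m / n) * ln 2
    hashes = round(bits / capacity * math.log(2))
    return bits, hashes


bits, hashes = optimal_size(10_000_000, 0.000001)
print(f"{bits / 8 / 1e6:.1f} MB on disk, {hashes} hash functions")
```

Under these assumptions each of the 16 concurrent operations moves a file of roughly 36 MB, which helps explain why file I/O dominates the timings.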

🚀 Quick Start

Installation

pip install aiobloom_live

BloomFilter Example

import asyncio
from aiobloom_live import BloomFilter

async def main():
    # Create a filter
    bf = BloomFilter(capacity=1000, error_rate=0.001)

    # Add elements
    bf.add("hello")
    bf.add("world")

    # Check for existence (false positives are possible at the configured
    # error rate; false negatives are not)
    assert "hello" in bf
    assert "python" not in bf

    # Asynchronously save to a file
    await bf.tofile_async("bloom.bin")

    # Asynchronously load from a file
    bf2 = await BloomFilter.fromfile_async("bloom.bin")
    assert "hello" in bf2
    print("✅ BloomFilter async read/write successful!")

if __name__ == "__main__":
    asyncio.run(main())

ScalableBloomFilter Example

import asyncio
from aiobloom_live import ScalableBloomFilter

async def main():
    # Create a scalable filter without worrying about capacity
    sbf = ScalableBloomFilter(initial_capacity=100, error_rate=0.001)

    # Add more elements than the initial capacity; the filter expands automatically
    for i in range(500):
        sbf.add(f"item_{i}")

    assert "item_499" in sbf
    # Never added; a false positive is possible but unlikely at this error rate
    assert "item_500" not in sbf
    
    # Async save and load
    await sbf.tofile_async("sbf.bin")
    sbf2 = await ScalableBloomFilter.fromfile_async("sbf.bin")
    assert "item_499" in sbf2
    print("✅ ScalableBloomFilter async read/write successful!")

if __name__ == "__main__":
    asyncio.run(main())

API Reference

BloomFilter(capacity, error_rate)

  • add(key) / __contains__(key) (the in operator): Add an element and check whether an element is present.
  • tofile_async(path) / fromfile_async(path): Asynchronously write to or read from a file.
  • tofile(file_obj) / fromfile(file_obj): Synchronously write to or read from a file object (compatible with pybloom_live).
  • to_bytes() / from_bytes(data): Serialize and deserialize.
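The async file methods pay off when several saves or loads are in flight at once. The sketch below shows one way to issue 16 writes concurrently with asyncio.gather. So that it runs with the standard library alone, it uses a hypothetical stand-in coroutine, save_blob, where a real program would call await bf.tofile_async(path); the helper names and file names are assumptions.

```python
import asyncio
import tempfile
from pathlib import Path


async def save_blob(path: Path, data: bytes) -> None:
    # Hypothetical stand-in for `await bf.tofile_async(path)`:
    # run the blocking write in a worker thread so the event loop stays free.
    await asyncio.to_thread(path.write_bytes, data)


async def save_many(directory: Path, blobs: list[bytes]) -> list[Path]:
    paths = [directory / f"filter_{i}.bin" for i in range(len(blobs))]
    # Start every write at once and wait for all of them to finish.
    await asyncio.gather(*(save_blob(p, b) for p, b in zip(paths, blobs)))
    return paths


with tempfile.TemporaryDirectory() as tmp:
    blobs = [bytes([i]) * 1024 for i in range(16)]
    written = asyncio.run(save_many(Path(tmp), blobs))
    sizes = [p.stat().st_size for p in written]

print(len(sizes), "files written")
```

The same shape works for loading: gather a list of fromfile_async calls and collect the returned filters.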

ScalableBloomFilter(initial_capacity, error_rate)

  • Similar functionality to BloomFilter, but with an automatically expanding capacity.
  • capacity: (Property) Get the current total capacity.
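A scalable filter is usually built as a chain of fixed-size sub-filters, with a larger one added whenever the current one fills up. The sketch below shows how total capacity could grow under that scheme. The growth factor of 2 and the helper name stages_needed are assumptions for illustration; the source does not state the growth policy aiobloom_live actually uses.

```python
def stages_needed(n_items: int, initial_capacity: int, growth: int = 2) -> list[int]:
    """Capacities of the sub-filters required to hold n_items (illustrative)."""
    capacities = [initial_capacity]
    while sum(capacities) < n_items:
        # Each new sub-filter is `growth` times larger than the previous one.
        capacities.append(capacities[-1] * growth)
    return capacities


caps = stages_needed(n_items=500, initial_capacity=100)
print(caps, sum(caps))  # prints [100, 200, 400] 700
```

Under these assumptions, the 500 insertions from the ScalableBloomFilter example would need three sub-filters, and a capacity property defined as the sum of the sub-filter capacities would report 700.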

🤝 How to Contribute

We warmly welcome contributions of all forms! Whether it's submitting issues, creating Pull Requests, or improving documentation, your help is greatly appreciated by the community.

  1. Fork this repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License.
