
A library for batching your HTTP requests.

Project description

Async Batcher

This project provides a Python library that collects asynchronous requests and processes them in batches.

How to use

To use the library, you need to install the package in your environment. You can install the package using pip:

pip install async-batcher

Then, you can create a subclass of AsyncBatcher that implements the process_batch method:

from async_batcher.batcher import AsyncBatcher

class MyAsyncBatcher(AsyncBatcher):
    async def process_batch(self, batch):
        # Process the whole batch in one call and return one result
        # per item (here we simply echo the items back)
        print(batch)
        return batch

# Create a new instance of the `MyAsyncBatcher` class
async_batcher = MyAsyncBatcher(max_batch_size=20)
async_batcher.start()
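
Once started, items can be submitted from any async context. Below is a minimal usage sketch, assuming the batcher exposes a `process` coroutine that queues one item and resolves with that item's result once its batch has been handled (this method name is an assumption; check the library's documentation for the exact API):

import asyncio

async def main():
    # Submit 50 items concurrently; the batcher groups them into
    # batches of at most `max_batch_size` (20 here) behind the scenes.
    results = await asyncio.gather(
        *(async_batcher.process(item=i) for i in range(50))  # `process` is assumed
    )
    print(results)

asyncio.run(main())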

Benchmark

To evaluate the performance of the AsyncBatcher library, we used the Keras example with Locust to simulate multiple users making requests to the server.

In this example, a TensorFlow model takes ~11ms to make a single prediction and ~13ms to process a batch of 200 predictions. Those timings explain the headroom batching buys: serving one request per model call tops out around 1 / 0.011s ≈ 90 predictions/second per worker, while a batch of 200 served in ~13ms corresponds to roughly 15,000 predictions/second.

For this benchmark, we ran the FastAPI server using Uvicorn with 12 workers, on a MacBook Pro with an Apple M2 Max chip (12 cores).
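
For reference, the two endpoints differ only in how they reach the model. Here is a minimal sketch of what the setup might look like (the model loading, the `ModelBatcher` name, and the `process` call are illustrative assumptions, not the benchmark's exact code):

from fastapi import FastAPI
from tensorflow import keras
from async_batcher.batcher import AsyncBatcher

app = FastAPI()
model = keras.models.load_model("model.keras")  # placeholder path

class ModelBatcher(AsyncBatcher):
    async def process_batch(self, batch):
        # One model call for the whole batch (~13ms for 200 inputs)
        return model.predict(batch).tolist()

batcher = ModelBatcher(max_batch_size=200)
batcher.start()

@app.post("/predict")
async def predict(features: list[float]):
    # One model call per request (~11ms each)
    return model.predict([features]).tolist()

@app.post("/optimized_predict")
async def optimized_predict(features: list[float]):
    # Queue the request and await its slot in the next batch
    return await batcher.process(item=features)  # `process` is assumed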

Predict endpoint:

[Graph: total requests per second for the /predict endpoint]

Optimized Predict endpoint:

[Graph: total requests per second for the /optimized_predict endpoint]

As the graphs show, for the /predict endpoint the failure rate started to increase linearly at ~284 requests/second, where the 95th percentile response time was ~1100ms. As the RPS grew further, successful requests averaged ~130 requests/second, with a high failure rate (~300 of ~400 requests/second failing).

For the /optimized_predict endpoint, the failure rate stayed below 3 requests/second (0.00625%) throughout the test. The average response time rose slightly with the RPS, reaching ~120ms by the end of the test (> 480 requests/second), while the 95th percentile response time remained nearly stable and under 500ms.

You can check the reports for more details.

Use cases

The AsyncBatcher library can be used in any application that needs to handle asynchronous requests in batches, such as:

  • Serving machine learning models that are optimized for batch processing (e.g. TensorFlow, PyTorch, Scikit-learn)
  • Storing multiple records in a database in a single query to optimize I/O (or to reduce the cost of database operations, e.g. AWS DynamoDB), as sketched below
  • Sending multiple messages in a single request to optimize network operations (or to reduce their cost, e.g. Kafka, RabbitMQ, AWS SQS, AWS SNS)
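
As an illustration of the database case, a batcher can collapse many single-row inserts into one multi-row statement. Here is a minimal sketch using aiosqlite (the table name and schema are made up for the example):

import aiosqlite
from async_batcher.batcher import AsyncBatcher

class InsertBatcher(AsyncBatcher):
    async def process_batch(self, batch):
        # One connection and one multi-row INSERT for the whole batch,
        # instead of one round trip per record
        async with aiosqlite.connect("app.db") as db:
            await db.executemany(
                "INSERT INTO events (payload) VALUES (?)",
                [(record,) for record in batch],
            )
            await db.commit()
        return [True] * len(batch)  # one acknowledgement per record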

