
FastDynamicBatcher is a library for batching inputs across requests to accelerate machine learning workloads.

Fast Dynamic Batcher

Bundling several ML model inputs into a larger batch is the simplest way to achieve significant inference speed-ups in ML workloads. The Fast Dynamic Batcher library has been built to make it easy to use such dynamic batches in Python web frameworks like FastAPI. With our dynamic batcher, you can combine the inputs of several requests into a single batch, which can then be run more efficiently on GPUs. In our testing, we achieved up to 2.5x more throughput with it.
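As a toy illustration of the principle (plain Python, not this library's API), per-request inputs can be combined and handled in a single model call:

```python
def infer_one(x):
    # Stand-in for an expensive per-request model call
    return x * 2

def infer_batch(xs):
    # One call over the combined inputs amortizes the fixed per-call
    # overhead (dispatch, kernel launches) across the whole batch
    return [x * 2 for x in xs]

requests = [1, 2, 3, 4]
per_request = [infer_one(x) for x in requests]  # four model calls
batched = infer_batch(requests)                 # one model call
assert per_request == batched == [2, 4, 6, 8]
```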

Example Usage

To use dynamic batching in FastAPI, first create a subclass of the InferenceModel class. Initialize your ML model in its __init__ method and run batched inference in its infer method:

from typing import Any
from fast_dynamic_batcher.inference_template import InferenceModel


class DemoModel(InferenceModel):
    def __init__(self):
        super().__init__()
        # Initialize your ML model here

    def infer(self, inputs: list[Any]) -> list[Any]:
        # Run your inputs as a single batch through your model
        ml_output = ...  # Your inference outputs
        return ml_output

Next, use your InferenceModel subclass to initialize our DynBatcher:

from contextlib import asynccontextmanager

from anyio import CapacityLimiter
from anyio.lowlevel import RunVar
from fastapi import FastAPI

from fast_dynamic_batcher.dyn_batcher import DynBatcher


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Raise the capacity of AnyIO's default thread limiter
    RunVar("_default_thread_limiter").set(CapacityLimiter(16))
    global dyn_batcher
    dyn_batcher = DynBatcher(DemoModel, max_batch_size=8, max_delay=0.1)
    yield
    dyn_batcher.stop()


app = FastAPI(lifespan=lifespan)


@app.post("/predict/")
async def predict(input_model: YourInputPydanticModel):
    return await dyn_batcher.process_batched(input_model)

The DynBatcher can be initialized in the FastAPI lifespan as a global variable and customized via its max_batch_size and max_delay arguments. In your FastAPI endpoints, register each input by calling its process_batched method.

Our dynamic batching algorithm will then wait until either the number of queued inputs equals max_batch_size or max_delay seconds have passed. In the latter case, a batch may contain between 1 and max_batch_size inputs. Once either condition is met, the batch is processed by calling the infer method of your InferenceModel instance.
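The batch-or-timeout rule can be sketched with a minimal asyncio queue. This is an illustrative toy, not the library's actual implementation; MiniBatcher and its methods are hypothetical names:

```python
import asyncio

class MiniBatcher:
    """Toy sketch: flush a batch when it reaches max_batch_size, or when
    max_delay seconds have passed since the first queued input."""

    def __init__(self, infer, max_batch_size=8, max_delay=0.1):
        self.infer = infer
        self.max_batch_size = max_batch_size
        self.max_delay = max_delay
        self.queue: asyncio.Queue = asyncio.Queue()

    async def process(self, item):
        # Register one input and wait for its result from the next batch.
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((item, fut))
        return await fut

    async def run_once(self):
        # Wait for the first input, then collect more until the batch is
        # full or the deadline passes.
        loop = asyncio.get_running_loop()
        item, fut = await self.queue.get()
        items, futures = [item], [fut]
        deadline = loop.time() + self.max_delay
        while len(items) < self.max_batch_size:
            timeout = deadline - loop.time()
            if timeout <= 0:
                break
            try:
                item, fut = await asyncio.wait_for(self.queue.get(), timeout)
            except asyncio.TimeoutError:
                break
            items.append(item)
            futures.append(fut)
        # Run the whole batch in one call and hand each result back.
        for f, out in zip(futures, self.infer(items)):
            f.set_result(out)
        return len(items)

async def demo():
    batcher = MiniBatcher(lambda xs: [x * 2 for x in xs],
                          max_batch_size=4, max_delay=0.05)
    requests = [asyncio.create_task(batcher.process(i)) for i in range(3)]
    worker = asyncio.create_task(batcher.run_once())
    results = await asyncio.gather(*requests)
    batch_size = await worker
    return results, batch_size

results, batch_size = asyncio.run(demo())
# Three inputs arrive before max_delay expires, so one batch of 3 forms
```

Here only 3 of the 4 possible slots fill before the deadline, so the timeout branch flushes a partial batch, mirroring the "between 1 and max_batch_size inputs" case described above.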

Installation

The Fast Dynamic Batcher library can be installed with pip:

pip install fast_dynamic_batcher

Performance Tests

We tested the performance of our dynamic batching solution against a baseline without batching on a Colab instance with a T4 GPU as well as on a laptop with an Intel i7-1250U CPU. The experiments were conducted by using this testing script. The results are reported in the table below:

Performance Experiments

Hardware            | No Batching | Dynamic Batch Size of 16
--------------------|-------------|-------------------------
Colab T4 GPU        | 7.65 s      | 3.07 s
CPU Intel i7-1250U  | 117.10 s    | 88.47 s

On GPUs, which benefit greatly from large batch sizes, we achieved a speed-up of almost 2.5x by creating dynamic batches of size 16. On CPUs, the gains are more modest with a speed-up of 1.3x.
