Fast Dynamic Batcher

Batching machine learning workloads is the easiest way to achieve significant inference speedups. The Fast Dynamic Batcher library adds easy support for dynamic batches to FastAPI. With our dynamic batcher, you can combine the inputs of several requests into a single batch, which can then be run on your GPU. With our example project we measured speed-ups of up to 2.8x with dynamic batching compared to a baseline without it.

Example Usage

To use dynamic batching in your FastAPI workloads, first create an instance of the InferenceModel class. Initialize your ML model in its __init__ method and use it in the infer method:

from fast_dynamic_batcher.dyn_batcher import Task
from fast_dynamic_batcher.inference_template import InferenceModel


class DemoModel(InferenceModel):
    def __init__(self):
        super().__init__()
        # Initialize your ML model here

    def infer(self, tasks: list[Task]) -> list[Task]:
        # Collect the inputs of all tasks in the batch
        inputs = [t.content for t in tasks]
        # Run the inputs through your model as a single batch
        ml_output = None  # Your inference outputs
        results = [
            Task(id=tasks[i].id, content=ml_output[i]) for i in range(len(tasks))
        ]
        return results
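To make the infer contract concrete, here is a toy batch transform that upper-cases string inputs. The Task dataclass below is a minimal local stand-in mirroring the two fields used above (id and content) so the sketch runs without the library installed; it is not the library's own class, and the "model" is just str.upper.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Minimal stand-in: a request id plus an arbitrary payload.
    id: int
    content: str


def infer(tasks: list[Task]) -> list[Task]:
    # Gather all payloads, transform them in one pass,
    # then re-attach each result to its originating task id.
    inputs = [t.content for t in tasks]
    outputs = [s.upper() for s in inputs]  # toy "model": upper-case
    return [Task(id=t.id, content=o) for t, o in zip(tasks, outputs)]


batch = [Task(id=0, content="hello"), Task(id=1, content="world")]
results = infer(batch)  # ids preserved, contents transformed as one batch
```

The important part is the id bookkeeping: because several requests share one batch, each output must be matched back to the task it came from.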

Subsequently, use your InferenceModel instance to initialize our DynBatcher:

from contextlib import asynccontextmanager

from anyio import CapacityLimiter
from anyio.lowlevel import RunVar
from fastapi import FastAPI

from fast_dynamic_batcher.dyn_batcher import DynBatcher


@asynccontextmanager
async def lifespan(app: FastAPI):
    RunVar("_default_thread_limiter").set(CapacityLimiter(16))
    global dyn_batcher
    dyn_batcher = DynBatcher(DemoModel, max_batch_size=8, max_delay=0.1)
    yield
    dyn_batcher.stop()


app = FastAPI(lifespan=lifespan)


@app.post("/predict/")
async def predict(
    input_model: YourInputPydanticModel,
):
    return await dyn_batcher.process_batched(input_model)

The DynBatcher can be initialized in the FastAPI lifespan as a global variable. It can be further customized with the max_batch_size and max_delay arguments. Use it in your FastAPI endpoints by registering their inputs via its process_batched method.

Our dynamic batching algorithm then waits until either max_batch_size inputs have accumulated or max_delay seconds have passed. In the latter case, a batch may contain between 1 and max_batch_size inputs. Once either condition is met, the batch is processed by calling the infer method of your InferenceModel instance.
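The size-or-timeout rule can be sketched with asyncio primitives. This is an illustrative reimplementation of the rule described above, not the library's internal code; the queue-based design and timing details are assumptions made for the sketch.

```python
import asyncio


async def collect_batch(queue: asyncio.Queue, max_batch_size: int, max_delay: float) -> list:
    # Block until at least one input arrives, then keep pulling inputs
    # until the batch is full or max_delay seconds have elapsed.
    batch = [await queue.get()]
    loop = asyncio.get_running_loop()
    deadline = loop.time() + max_delay
    while len(batch) < max_batch_size:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        try:
            batch.append(await asyncio.wait_for(queue.get(), timeout=remaining))
        except asyncio.TimeoutError:
            break
    return batch


async def demo():
    q: asyncio.Queue = asyncio.Queue()
    for i in range(3):
        q.put_nowait(i)
    # A full batch of 2 returns immediately; the next call holds the
    # remaining single input until max_delay expires.
    first = await collect_batch(q, max_batch_size=2, max_delay=0.05)
    second = await collect_batch(q, max_batch_size=2, max_delay=0.05)
    return first, second


first, second = asyncio.run(demo())
```

With three queued inputs and max_batch_size=2, the first batch fills instantly while the second (a lone leftover input) is released only after the delay, matching the 1-to-max_batch_size range described above.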
