Automatic micro-batching for HTTP LLM calls and local PyTorch inference, backed by a Rust core.

llm-autobatch

Production-minded micro-batching for LLM calls and local PyTorch inference, backed by a single Rust core.

Simple: the @autobatch decorator turns single calls into efficient batches.
Adapter-based: swap HTTP or Torch executors without changing the core.
Rust-fast: thread-safe queues, micro-windows, and backpressure.

60-second Quickstart

pip install llm-autobatch

from llm_autobatch import autobatch

@autobatch(max_batch=32, max_wait_ms=10)
def call_llm(prompts: list[str]) -> list[str]:
    # Receives a batch of prompts; return one output per prompt, in the same order.
    # Replace the body with a real batch call.
    return [p.upper() for p in prompts]

print(call_llm("hello"))
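
To make the decorator's behavior concrete, here is a minimal pure-Python sketch of micro-batching: callers submit single items, a worker thread collects up to max_batch items or waits at most max_wait_ms, then runs the batch function once. MiniBatcher and everything inside it are hypothetical names used for illustration only; the real core is written in Rust and may work differently.

```python
import queue
import threading
import time
from concurrent.futures import Future


class MiniBatcher:
    """Toy micro-batcher (illustrative only, not llm-autobatch's implementation)."""

    def __init__(self, fn, max_batch=32, max_wait_ms=10):
        self.fn = fn
        self.max_batch = max_batch
        self.max_wait = max_wait_ms / 1000.0
        self.pending = queue.Queue()
        self.batch_sizes = []  # size of every batch formed, for inspection
        threading.Thread(target=self._loop, daemon=True).start()

    def __call__(self, item):
        fut = Future()
        self.pending.put((item, fut))
        return fut.result()  # block until the batch containing `item` has run

    def _loop(self):
        while True:
            batch = [self.pending.get()]  # wait for the first item
            deadline = time.monotonic() + self.max_wait
            # Keep collecting until the batch is full or the window closes.
            while len(batch) < self.max_batch:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(self.pending.get(timeout=remaining))
                except queue.Empty:
                    break
            self.batch_sizes.append(len(batch))
            items = [item for item, _ in batch]
            try:
                outputs = self.fn(items)
                if len(outputs) != len(items):
                    raise ValueError("batch function must return one output per input")
                # Route each output back to the caller that submitted the item.
                for (_, fut), out in zip(batch, outputs):
                    fut.set_result(out)
            except Exception as exc:
                for _, fut in batch:
                    fut.set_exception(exc)
```

A single call such as `MiniBatcher(fn)("hello")` waits at most one window and then runs as a batch of one; calls arriving from several threads within the same window share a batch.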

Object-based API

from llm_autobatch import Batcher

batcher = Batcher(max_batch=32, max_wait_ms=10)

def batch_executor(items: list[str]) -> list[str]:
    return [s + "!" for s in items]

print(batcher.run("hi", executor=batch_executor))

HTTP adapter (OpenAI-style)

from llm_autobatch.http import OpenAIResponsesExecutor
from llm_autobatch import Batcher

executor = OpenAIResponsesExecutor(api_key="...", model="gpt-4o-mini")
batcher = Batcher(max_batch=32, max_wait_ms=10)

out = batcher.run("Explain Rust ownership", executor=executor)
print(out)
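
If no bundled adapter fits your backend, the Object-based API above suggests that an executor can be any callable that takes a list and returns a list. The sketch below is one way to write such a callable for an endpoint that only accepts one prompt per request: it fans the batch out over a thread pool and returns results in input order. FanOutExecutor and send_one are hypothetical names, not part of llm-autobatch, and whether Batcher accepts a callable object (rather than only a function) is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


class FanOutExecutor:
    """Hypothetical batch executor: one request per item, sent concurrently."""

    def __init__(self, send_one: Callable[[str], str], max_workers: int = 8):
        self.send_one = send_one  # your own function that performs a single request
        self.max_workers = max_workers

    def __call__(self, prompts: list[str]) -> list[str]:
        if not prompts:
            return []
        workers = min(self.max_workers, len(prompts))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Executor.map yields results in input order, not completion order.
            return list(pool.map(self.send_one, prompts))
```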

Torch adapter

from llm_autobatch.torch import TorchExecutor
from llm_autobatch import Batcher

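# model, collate, and x are placeholders: supply your own torch.nn.Module, collate function, and input
executor = TorchExecutor(model=model, collate_fn=collate, device="cuda")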
batcher = Batcher(max_batch=64, max_wait_ms=5)

print(batcher.run(x, executor=executor))

Benchmark

Run a local throughput test:

python benches/bench_throughput.py

Sample output (illustrative):

items=10000 max_batch=64 max_wait_ms=5  avg_batch=42.7  p99_ms=11.2
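
For readers who want to reproduce figures like avg_batch and p99_ms from their own measurements, the sketch below shows one plausible way to compute them from recorded batch sizes and per-item latencies. The function names are hypothetical, and the nearest-rank percentile used here is an assumption; benches/bench_throughput.py may compute its numbers differently.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(batch_sizes, latencies_ms):
    """Aggregate per-batch sizes and per-item latencies into headline metrics."""
    return {
        "items": sum(batch_sizes),
        "avg_batch": sum(batch_sizes) / len(batch_sizes),
        "p99_ms": percentile(latencies_ms, 99),
    }
```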

Why Rust?

Deterministic batching windows without Python GIL bottlenecks
Low-latency coordination under high concurrency
Single core reused across HTTP and Torch adapters
Memory safety while handling multithreaded queues

FAQ

Does this change my model API? No. You keep your executor; the core only handles batching and routing.

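Do I need Rust installed? No. We publish prebuilt wheels for macOS, Linux, and Windows, so pip install llm-autobatch should work without Rust. If no wheel matches your platform, pip falls back to the source distribution, which typically needs a Rust toolchain to build.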

How do I enable HTTP or Torch adapters? Install extras:

pip install "llm-autobatch[http]"
pip install "llm-autobatch[torch]"

What does backpressure do? It controls how new items are handled when the queue is full. Three modes are available:

block: wait for space
drop: reject when full
passthrough: execute immediately
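
The three modes can be sketched against a bounded queue from the standard library. This is an illustration under assumptions, not the library's code: submit and QueueFullError are hypothetical names, and treating passthrough as "run the item as a batch of one when the queue is full" is one reading of "execute immediately".

```python
import queue


class QueueFullError(RuntimeError):
    """Raised by the 'drop' policy when the queue has no free slot."""


def submit(pending, item, policy, executor):
    """Enqueue `item` under the given policy; returns a result only for passthrough."""
    if policy not in ("block", "drop", "passthrough"):
        raise ValueError(f"unknown policy: {policy!r}")
    if policy == "block":
        pending.put(item)  # blocks the caller until a slot is free
        return None
    try:
        pending.put_nowait(item)
        return None
    except queue.Full:
        if policy == "drop":
            raise QueueFullError("queue is full") from None
        # passthrough: skip the queue and run this item as a batch of one
        return executor([item])[0]
```

block trades latency for completeness, drop protects the service at the cost of rejected calls, and passthrough keeps callers moving but gives up batching efficiency under load.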

Can I use async? Not in v1. Async support is planned for v1.1.

Is ordering preserved? Yes, provided your executor returns one output per input, in the same order as the inputs of each batch.
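
An executor that reorders items internally, for example to group prompts of similar length, has to restore the original order before returning. The helper below is a hypothetical sketch of that pattern; run_sorted_by_length and run_batch are illustrative names, not part of llm-autobatch.

```python
def run_sorted_by_length(prompts, run_batch):
    """Run `run_batch` on length-sorted prompts, then restore the caller's order."""
    # Indices of the prompts, ordered from shortest to longest prompt.
    order = sorted(range(len(prompts)), key=lambda i: len(prompts[i]))
    sorted_outputs = run_batch([prompts[i] for i in order])
    if len(sorted_outputs) != len(prompts):
        raise ValueError("executor must return one output per input")
    outputs = [None] * len(prompts)
    for position, original_index in enumerate(order):
        outputs[original_index] = sorted_outputs[position]
    return outputs
```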

License

Apache-2.0

This version

0.1.1

Feb 10, 2026

Download files

Download the file for your platform.

Source Distribution

llm_autobatch-0.1.1.tar.gz (13.2 kB)

Uploaded Feb 10, 2026 Source

Built Distribution

llm_autobatch-0.1.1-cp39-abi3-macosx_11_0_arm64.whl (287.9 kB)

Uploaded Feb 10, 2026 CPython 3.9+, macOS 11.0+ ARM64

File details

Details for the file llm_autobatch-0.1.1.tar.gz.

File metadata

Download URL: llm_autobatch-0.1.1.tar.gz
Upload date: Feb 10, 2026
Size: 13.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for llm_autobatch-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`7284a4d5f0d8c8cec6e6efc07ce9b37213f9f463e3298ef36088b93fb43558e2`
MD5	`fc2ab1bb7315a7212de0d2a27c7282f9`
BLAKE2b-256	`4a1dc703d55061ffb707a5cc4bb33a960c38c4ba5e735ed2ad838fa1c61f6609`
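
To check a downloaded file against the SHA256 digest listed above, a short script using only the standard library is enough. This is a minimal sketch; sha256_of and verify are hypothetical helper names, and the file path is whatever location you saved the download to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True if the file's SHA256 digest matches the published value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Call `verify("llm_autobatch-0.1.1.tar.gz", "<SHA256 from the table>")` and refuse to install the file if it returns False.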

File details

Details for the file llm_autobatch-0.1.1-cp39-abi3-macosx_11_0_arm64.whl.

File metadata

Download URL: llm_autobatch-0.1.1-cp39-abi3-macosx_11_0_arm64.whl
Upload date: Feb 10, 2026
Size: 287.9 kB
Tags: CPython 3.9+, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for llm_autobatch-0.1.1-cp39-abi3-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`7ded2a9df40d111dac181510a48c485b059b815e4119354fb852afc05003cccf`
MD5	`a2d586d26693a7cd272d364016cda6bd`
BLAKE2b-256	`0a6dbb61173377179122fc54c9747b07336f619e2a15631dd060b32d78d51799`
