Offline and online benchmarking utilities for large language model workloads

BatchBench

BatchBench bundles three benchmarking utilities behind installable Python entrypoints:

  • batchbench.generate produces JSONL request corpora with controllable prefix overlap and approximate token counts.
  • batchbench.offline drives an offline vLLM workload to record prompt and generation throughput.
  • batchbench.online launches the packaged Rust binary that fans requests out to OpenAI-compatible endpoints in parallel.

Installation

pip install batchbench

Optional extras install tool-specific dependencies:

pip install "batchbench[generate]"   # adds transformers for prompt sizing
pip install "batchbench[offline]"    # adds vllm for the offline benchmark

Generating Requests

batchbench.generate \
  --count 100 \
  --prefix-overlap 0.3 \
  --approx-input-tokens 512 \
  --tokenizer-model gpt-3.5-turbo \
  --output data

Each row in the resulting JSONL file has a text field. The filename embeds run metadata (count, tokens, prefix, tokenizer) to keep runs distinct.
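The row format can be illustrated with a minimal sketch. The sample prompts and the temporary path below are illustrative, not produced by batchbench.generate; only the one-JSON-object-per-line shape with a "text" field comes from the description above.

```python
import json
import os
import tempfile

# Hand-written corpus mirroring the documented JSONL format:
# one JSON object per line, each carrying a "text" field.
sample = [
    {"text": "Summarize the quarterly report."},
    {"text": "Summarize the annual report."},
]

path = os.path.join(tempfile.mkdtemp(), "requests.jsonl")
with open(path, "w") as f:
    for row in sample:
        f.write(json.dumps(row) + "\n")

# Reading it back the way a consumer (e.g. the online benchmark's
# --jsonl input) would: one prompt per line.
with open(path) as f:
    prompts = [json.loads(line)["text"] for line in f]
print(len(prompts))  # 2
```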

Offline Benchmarking

The offline harness requires vLLM and a compatible model checkpoint.

batchbench.offline \
  --model facebook/opt-125m \
  --num_reqs 2048 \
  --icl 1024 \
  --ocl 1

The command prints prompt/generation throughput statistics and writes the sampled history to vllm_throughput_history.csv (configurable via --throughput_csv).
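The history CSV can be post-processed with the standard library. The column names below (step, prompt_tok_per_s, gen_tok_per_s) are hypothetical stand-ins; the actual headers written by batchbench.offline are not documented here and may differ.

```python
import csv
import io

# Hypothetical sample of a throughput history CSV.
raw = """step,prompt_tok_per_s,gen_tok_per_s
0,12000.0,450.0
1,11800.0,470.0
"""

rows = list(csv.DictReader(io.StringIO(raw)))
# Average the generation-throughput column across sampled steps.
avg_gen = sum(float(r["gen_tok_per_s"]) for r in rows) / len(rows)
print(f"mean generation throughput: {avg_gen} tok/s")  # 460.0 tok/s
```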

Online Benchmarking

batchbench.online wraps the Rust executable that used to live under rust-bench/. The binary ships inside the wheel, so Cargo is not required on the host.

batchbench.online \
  --jsonl data/requests.jsonl \
  --model gpt-4o-mini \
  --host https://api.openai.com \
  --endpoint /v1/chat/completions \
  --users 8 \
  --requests-per-user 1

Provide an API key via --api-key or the environment variable named by --api-key-env (defaults to OPENAI_API_KEY).
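The precedence between the two options can be sketched as follows. This is an illustration of the documented behavior (explicit flag wins, then the named environment variable), not the tool's actual implementation; DEMO_KEY_VAR is a hypothetical variable name.

```python
import os

def resolve_api_key(api_key=None, api_key_env="OPENAI_API_KEY"):
    # An explicit --api-key value takes precedence; otherwise fall back
    # to the environment variable named by --api-key-env.
    return api_key if api_key is not None else os.environ.get(api_key_env)

os.environ["DEMO_KEY_VAR"] = "demo-key"
print(resolve_api_key("explicit-key"))              # explicit-key
print(resolve_api_key(api_key_env="DEMO_KEY_VAR"))  # demo-key
```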

Development Notes

The project now follows a src/ layout. Run pip install -e ".[generate,offline]" during development to work against the editable package (the quotes keep shells from expanding the brackets). The Rust binary can be rebuilt with cargo build --release inside rust-bench/; copy the resulting executable to src/batchbench/bin/ if you need to refresh it.


