
Offline and online benchmarking utilities for large language model workloads

Project description

BatchBench

BatchBench bundles three benchmarking utilities behind installable Python entrypoints:

  • batchbench.generate produces JSONL request corpora with controllable prefix overlap and approximate token counts.
  • batchbench.offline drives an offline vLLM workload to record prompt and generation throughput.
  • batchbench.online launches the packaged Rust binary that fans requests out to OpenAI-compatible endpoints in parallel.

Installation

pip install batchbench

Optional extras install tool-specific dependencies:

pip install "batchbench[generate]"   # adds transformers for prompt sizing
pip install "batchbench[offline]"    # adds vllm for the offline benchmark

Generating Requests

batchbench.generate \
  --count 100 \
  --prefix-overlap 0.3 \
  --approx-input-tokens 512 \
  --tokenizer-model gpt-3.5-turbo \
  --output data

Each line of the resulting JSONL file is a JSON object with a text field. The filename embeds run metadata (count, tokens, prefix, tokenizer) to keep runs distinct.
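The request format can be exercised in a few lines of Python. This is a minimal sketch: the example prompts and filename are illustrative, and only the text field is what the generator promises.

```python
import json

# Each line of a BatchBench request file is a JSON object with a "text" field.
rows = [
    {"text": "Summarize the following report."},
    {"text": "Summarize the following memo."},
]

path = "requests.jsonl"
with open(path, "w") as fh:
    for row in rows:
        fh.write(json.dumps(row) + "\n")

# Reading it back yields one prompt per line.
with open(path) as fh:
    prompts = [json.loads(line)["text"] for line in fh]

print(len(prompts))
```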

Offline Benchmarking

The offline harness requires vLLM and a compatible model checkpoint.

batchbench.offline \
  --model facebook/opt-125m \
  --num_reqs 2048 \
  --icl 1024 \
  --ocl 1

The command prints prompt/generation throughput statistics and writes the sampled history to vllm_throughput_history.csv (configurable via --throughput_csv).
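The history file can be post-processed with the standard csv module. The sketch below stays agnostic about the exact fields BatchBench emits, since the column names come from the CSV header; the toy sample uses hypothetical column names for illustration only.

```python
import csv
import io

def load_history(fh):
    """Parse a throughput-history CSV into a list of row dicts.

    Column names are taken from the file's header row, so this works
    regardless of the exact fields the harness writes.
    """
    return list(csv.DictReader(fh))

# Toy in-memory sample with hypothetical columns; a real run's header may differ.
sample = io.StringIO(
    "time_s,prompt_tok_s,gen_tok_s\n"
    "1.0,5000,250\n"
    "2.0,5100,260\n"
)
rows = load_history(sample)
print(len(rows), rows[0]["gen_tok_s"])
```

In practice you would open vllm_throughput_history.csv (or whatever --throughput_csv names) instead of the in-memory sample.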

Online Benchmarking

batchbench.online wraps the Rust executable that used to live under rust-bench/. The binary ships inside the wheel, so Cargo is not required on the host.

batchbench.online \
  --jsonl data/requests.jsonl \
  --model gpt-4o-mini \
  --host https://api.openai.com \
  --endpoint /v1/chat/completions \
  --users 8 \
  --requests-per-user 1

Provide an API key via --api-key or the environment variable named by --api-key-env (defaults to OPENAI_API_KEY).
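The key-resolution precedence described above (explicit flag first, then the named environment variable) can be sketched in Python. The function name and error message are illustrative, not the wrapper's actual internals.

```python
import os

def resolve_api_key(api_key=None, api_key_env="OPENAI_API_KEY"):
    """An explicit --api-key value wins; otherwise fall back to the
    environment variable named by --api-key-env."""
    if api_key:
        return api_key
    key = os.environ.get(api_key_env)
    if not key:
        raise RuntimeError(f"no API key: pass --api-key or set ${api_key_env}")
    return key

os.environ["OPENAI_API_KEY"] = "sk-demo"      # illustrative value only
print(resolve_api_key())                      # falls back to the env var
print(resolve_api_key(api_key="sk-flag"))     # explicit flag takes precedence
```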

Development Notes

The project follows a src/ layout. Run pip install -e ".[generate,offline]" during development to work against the editable package (the quotes keep shells such as zsh from expanding the brackets). The Rust binary can be rebuilt with cargo build --release inside rust-bench/; copy the resulting executable to src/batchbench/bin/ if you need to refresh it.

Download files

Download the file for your platform.

Source Distribution

batchbench-0.1.0.tar.gz (2.5 MB)


Built Distribution


batchbench-0.1.0-py3-none-any.whl (2.5 MB)


File details

Details for the file batchbench-0.1.0.tar.gz.

File metadata

  • Download URL: batchbench-0.1.0.tar.gz
  • Upload date:
  • Size: 2.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for batchbench-0.1.0.tar.gz:

  • SHA256: e0b3cebd373ee27915a0f6f413ae2e4b970ea28038efa1b36e9a45bd7664f77c
  • MD5: 94acff9884ab6670541b702ba1e12b0a
  • BLAKE2b-256: 4e52df9a594bcb52e2948c7238637405cba5e1019df35833c33deed8ada082d4
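A downloaded archive can be checked against the SHA256 digest above with the standard hashlib module. This is a generic verification sketch, not part of BatchBench itself.

```python
import hashlib

def sha256_of(path, chunk=1 << 20):
    """Stream a file through SHA-256 so large archives never load fully into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        while block := fh.read(chunk):
            h.update(block)
    return h.hexdigest()

# Expected digest for batchbench-0.1.0.tar.gz, taken from the listing above.
expected = "e0b3cebd373ee27915a0f6f413ae2e4b970ea28038efa1b36e9a45bd7664f77c"
# assert sha256_of("batchbench-0.1.0.tar.gz") == expected
```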


File details

Details for the file batchbench-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: batchbench-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 2.5 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for batchbench-0.1.0-py3-none-any.whl:

  • SHA256: a796615159cf35e514e2c14ec1f8dde69d02b8e2819aec2def762551ec7e651b
  • MD5: ac897139930bd8ad73a462ddf9f05517
  • BLAKE2b-256: 761073358def3a0e382cb82b4183c25c372f74e133a599be2e479972855ef9fd

