LLM Inference Benchmark CLI - measure TTFT, TPS, ITL, E2E latency for any OpenAI-compatible API

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

kuhung

These details have not been verified by PyPI

Project description

llm-benchmark-runner

LLM 推理性能测评 CLI 工具 -- 测量 TTFT / TPS / ITL / E2E 延迟，输出标准 JSON 可直接导入 Web 端查看可视化图表。

适用于浏览器无法触达的场景（CORS 未配置、无头服务器 SSH 环境、CI/CD 集成等）。

安装

pip：

pip install llm-benchmark-runner

uv：

uv pip install llm-benchmark-runner

从源码（开发模式）：

cd runner
pip install -e .

使用

# 基本用法
llm-benchmark --url http://localhost:11434 --model llama3.2

# 完整参数
llm-benchmark \
  --url http://localhost:11434 \
  --model llama3.2 \
  --name "My Ollama" \
  --prompt "Write a short essay about AI." \
  --max-tokens 512 \
  --repeat 10 \
  --concurrency 1,2,4,8 \
  --output results.json

# 也可以用 python -m 方式运行
python -m llm_benchmark_runner --url http://localhost:11434 --model llama3.2

参数说明

参数	默认值	说明
`--url`	(必填)	模型 API 的 Base URL
`--model`	(必填)	Model ID
`--name`	自动生成	端点显示名称
`--api-key`	空	API Key
`--prompt`	内置	测试 Prompt
`--max-tokens`	256	最大输出 token 数
`--repeat`	5	每个并发级别重复次数
`--concurrency`	1,2,4,8	并发级别（逗号分隔）
`--output`	自动命名	输出 JSON 文件路径
`--version`	-	显示版本号

输出格式

输出的 JSON 文件遵循 BenchmarkSession 标准格式，包含：

测评配置（prompt、maxTokens、repeatCount、concurrencyLevels）
每个端点的聚合指标（TTFT / TPS / ITL / E2E 的 mean/median/p95/p99/min/max/stdDev）
并发压测结果（各并发级别的吞吐量和延迟）
五维雷达评分（Speed / Responsiveness / Smoothness / Scalability / Stability）
原始请求数据（逐请求的 token 时间戳）

导入 Web 端

输出的 JSON 文件可直接导入 Web 端的 "历史记录 -> 导入 JSON" 查看可视化图表：

打开 Web 端（https://benchmark-for-llm.vercel.app）
切换到 "历史记录" Tab
点击 "导入" 按钮
选择 CLI 输出的 JSON 文件

支持的 API 格式

OpenAI Chat Completions API（/v1/chat/completions）
Ollama（兼容 OpenAI 格式 + 原生格式）
LM Studio
vLLM
llama.cpp
任何支持 SSE streaming 的 OpenAI 兼容 API

开发

cd runner
uv build          # 构建分发包
uv publish        # 发布到 PyPI（需要配置 token）

License

MIT

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

kuhung

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.2.0

May 20, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llm_benchmark_runner-0.2.0.tar.gz (12.2 kB view details)

Uploaded May 20, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

llm_benchmark_runner-0.2.0-py3-none-any.whl (9.9 kB view details)

Uploaded May 20, 2026 Python 3

File details

Details for the file llm_benchmark_runner-0.2.0.tar.gz.

File metadata

Download URL: llm_benchmark_runner-0.2.0.tar.gz
Upload date: May 20, 2026
Size: 12.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llm_benchmark_runner-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`2cb708845711021ff50c248bd792c41a9766edc3a908e62cd6b0d54b0d860aa8`
MD5	`60494a87927f6e685e3659a3d35e05e8`
BLAKE2b-256	`083b57328cb25515cba18bcc65b9941a3923455942d49a00e76fa6c84d4a1b6c`

See more details on using hashes here.

Provenance

The following attestation bundles were made for llm_benchmark_runner-0.2.0.tar.gz:

Publisher: publish-pypi.yml on kuhung/benchmark-for-LLM

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_benchmark_runner-0.2.0.tar.gz
- Subject digest: 2cb708845711021ff50c248bd792c41a9766edc3a908e62cd6b0d54b0d860aa8
- Sigstore transparency entry: 1582383507
- Sigstore integration time: May 20, 2026
Source repository:
- Permalink: kuhung/benchmark-for-LLM@82029dd1ae916efbaf0f84c1f425b7ae01c79a66
- Branch / Tag: refs/heads/main
- Owner: https://github.com/kuhung
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@82029dd1ae916efbaf0f84c1f425b7ae01c79a66
- Trigger Event: workflow_dispatch

File details

Details for the file llm_benchmark_runner-0.2.0-py3-none-any.whl.

File metadata

Download URL: llm_benchmark_runner-0.2.0-py3-none-any.whl
Upload date: May 20, 2026
Size: 9.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llm_benchmark_runner-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`405c410b2c97048905243e84dd0b11a330d9445ef710fb81d9480d9a4ff25bfe`
MD5	`9ec94b72c8cab72adc933b9406461f8c`
BLAKE2b-256	`022d2736f9d0c244b79ccf30e2740e25a7b1572037c8560a8b444a0cd94cd6de`

See more details on using hashes here.

Provenance

The following attestation bundles were made for llm_benchmark_runner-0.2.0-py3-none-any.whl:

Publisher: publish-pypi.yml on kuhung/benchmark-for-LLM

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_benchmark_runner-0.2.0-py3-none-any.whl
- Subject digest: 405c410b2c97048905243e84dd0b11a330d9445ef710fb81d9480d9a4ff25bfe
- Sigstore transparency entry: 1582383625
- Sigstore integration time: May 20, 2026
Source repository:
- Permalink: kuhung/benchmark-for-LLM@82029dd1ae916efbaf0f84c1f425b7ae01c79a66
- Branch / Tag: refs/heads/main
- Owner: https://github.com/kuhung
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@82029dd1ae916efbaf0f84c1f425b7ae01c79a66
- Trigger Event: workflow_dispatch

llm-benchmark-runner 0.2.0

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

llm-benchmark-runner

安装

使用

参数说明

输出格式

导入 Web 端

支持的 API 格式

开发

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance