LLM benchmarking tools for the LLM CLI
Project description
LLM Benchmarking Plugin
This is a plugin for the llm tool that adds a benchmark command to compare the performance of different language models.
The command runs a prompt, with an optional system prompt, against several models and compares their performance.
Installation
You can install the plugin using pip:
pip install llm-profile
or using llm:
llm install llm-profile
Metrics
- Total Time - The time taken from the request to the end of the final chunk
- Time to First Chunk - The time taken from the request to the first chunk of the response
- Length of Response - The length of the response text
- Number of Chunks - The number of chunks in the response
- Chunks per Second - The number of chunks divided by the total time taken
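As a rough illustration of how these five metrics relate to one another, here is a minimal sketch that computes them from any iterable of text chunks. It is not the plugin's actual implementation: `measure_stream` and `fake_stream` are hypothetical names, and the fake stream merely stands in for a model's streaming response.

```python
import time
from typing import Iterable


def measure_stream(chunks: Iterable[str]) -> dict:
    """Consume a stream of text chunks and compute the five metrics."""
    start = time.perf_counter()
    first_chunk_at = None
    parts = []
    for chunk in chunks:
        if first_chunk_at is None:
            # Record when the first chunk arrives.
            first_chunk_at = time.perf_counter()
        parts.append(chunk)
    total_time = time.perf_counter() - start

    return {
        "total_time": total_time,
        # None when the stream produced no chunks at all.
        "time_to_first_chunk": (
            first_chunk_at - start if first_chunk_at is not None else None
        ),
        "length_of_response": len("".join(parts)),
        "number_of_chunks": len(parts),
        "chunks_per_second": len(parts) / total_time if total_time > 0 else 0.0,
    }


def fake_stream():
    # Hypothetical stand-in for a model's streaming response.
    for piece in ["Hello", ", ", "world", "!"]:
        time.sleep(0.01)
        yield piece


metrics = measure_stream(fake_stream())
print(metrics)
```

Note that Chunks per Second is derived from the other two values, so a model with a long Time to First Chunk will score low on it even if the chunks then arrive quickly.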
Benchmark Usage
To run a benchmark, provide a prompt and any number of models, each passed with -m using its llm alias (as listed by llm models):
$ llm benchmark -m azure/ant-grok-3-mini -m azure/ants-gpt-4.1-mini -s "Respond in emoji" "Give me a friendly hello message" --markdown
For a single pass (no repeats) you will get a summary table:
| Benchmark | Total Time | Time to First Chunk | Length of Response | Number of Chunks | Chunks per Second |
|---|---|---|---|---|---|
| azure/ant-grok-3-mini | 7.79 | 7.76 | 112 | 30 | 3.85 |
| azure/ants-gpt-4.1-mini | 2.99 | 2.80 | 78 | 19 | 6.36 |
To repeat each benchmark and report the range and mean of each metric, use the --repeat argument:
| Benchmark | Total Time | Time to First Chunk | Length of Response | Number of Chunks | Chunks per Second |
|---|---|---|---|---|---|
| azure/ant-grok-3-mini | 2.59 <-> 8.39 (x̄=5.49) | 2.57 <-> 8.36 (x̄=5.47) | 65 <-> 109 (x̄=87.00) | 18 <-> 30 (x̄=24.00) | 2.15 <-> 11.58 (x̄=6.86) |
| azure/ants-gpt-4.1-mini | 0.54 <-> 2.88 (x̄=1.71) | 0.26 <-> 2.69 (x̄=1.47) | 76 <-> 78 (x̄=77.00) | 19 <-> 19 (x̄=19.00) | 6.60 <-> 35.17 (x̄=20.89) |
Each cell shows the range and the mean across the repeated runs, in the form min <-> max (x̄=mean).
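The cell format can be reproduced with a few lines of Python. This is a sketch under assumptions, not the plugin's code: `format_range` is a hypothetical helper, and the input values are taken from the table above on the assumption that each benchmark was run twice (the reported means equal the midpoint of min and max).

```python
from statistics import mean


def format_range(values):
    """Format repeated measurements as 'min <-> max (x̄=mean)'."""

    def fmt(value):
        # Counts stay as integers; timings are shown with two decimals.
        return str(value) if isinstance(value, int) else f"{value:.2f}"

    return f"{fmt(min(values))} <-> {fmt(max(values))} (x̄={mean(values):.2f})"


total_times = [2.59, 8.39]   # Total Time for two runs
response_lengths = [65, 109]  # Length of Response for two runs

print(format_range(total_times))       # 2.59 <-> 8.39 (x̄=5.49)
print(format_range(response_lengths))  # 65 <-> 109 (x̄=87.00)
```

With only a handful of repeats the spread can be wide, as the first row of the table shows, so the range is often more informative than the mean alone.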
Markdown formatted results
By default, tables are printed in color, highlighting the fastest and slowest value for each metric in a benchmark.
If you want to customize the output, you can use the --markdown flag to get the results in a Markdown-friendly format.
Non-Streaming models
If you want to benchmark models that do not support streaming, you can use the --no-stream flag. This will disable streaming and provide a single response time.
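To make the difference from the streaming case concrete, the sketch below times a single blocking call. It is only an illustration with hypothetical names (`measure_blocking`, `fake_model`), not the plugin's implementation. Because the whole response arrives at once, there is no first chunk to time separately, so the sketch records only the total time and the response length.

```python
import time
from typing import Callable


def measure_blocking(call: Callable[[], str]) -> dict:
    """Time one blocking call that returns the full response text."""
    start = time.perf_counter()
    text = call()
    total_time = time.perf_counter() - start
    return {"total_time": total_time, "length_of_response": len(text)}


def fake_model() -> str:
    # Hypothetical stand-in for a non-streaming model call.
    time.sleep(0.01)
    return "Hello, world!"


result = measure_blocking(fake_model)
print(result)
```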
File details
Details for the file llm_profile-0.2.0.tar.gz.
File metadata
- Download URL: llm_profile-0.2.0.tar.gz
- Upload date:
- Size: 5.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d00151fd0cf4d97b0f3e943e5eea47345f7e937d58d1456df911dd3c9048eaa6 |
| MD5 | 4bc3338a41872dfcf6188bedd3d658cf |
| BLAKE2b-256 | 9b514a8ea55c77dfcc6fcfb280bbb7653219d185bd480b79340343ad8a79d17e |
Provenance
The following attestation bundles were made for llm_profile-0.2.0.tar.gz:
Publisher: python-publish.yml on tonybaloney/llm-profile
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_profile-0.2.0.tar.gz
- Subject digest: d00151fd0cf4d97b0f3e943e5eea47345f7e937d58d1456df911dd3c9048eaa6
- Sigstore transparency entry: 412009668
- Sigstore integration time:
- Permalink: tonybaloney/llm-profile@dd0ff7641421a4a332a003c69acaaef1e4e9c18d
- Branch / Tag: refs/tags/0.2.0
- Owner: https://github.com/tonybaloney
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@dd0ff7641421a4a332a003c69acaaef1e4e9c18d
- Trigger Event: release
File details
Details for the file llm_profile-0.2.0-py3-none-any.whl.
File metadata
- Download URL: llm_profile-0.2.0-py3-none-any.whl
- Upload date:
- Size: 5.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 598df5b647c902b6a40ce10b13f1d2f2b97a2ba9192af8c7fc63208ad16ac1de |
| MD5 | c67333befadde4c9e6af90d84711e9a1 |
| BLAKE2b-256 | c6caf3a31556c9f27f93193ed5331d9828eea037037574873df67cf109873d6c |
Provenance
The following attestation bundles were made for llm_profile-0.2.0-py3-none-any.whl:
Publisher: python-publish.yml on tonybaloney/llm-profile
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_profile-0.2.0-py3-none-any.whl
- Subject digest: 598df5b647c902b6a40ce10b13f1d2f2b97a2ba9192af8c7fc63208ad16ac1de
- Sigstore transparency entry: 412009677
- Sigstore integration time:
- Permalink: tonybaloney/llm-profile@dd0ff7641421a4a332a003c69acaaef1e4e9c18d
- Branch / Tag: refs/tags/0.2.0
- Owner: https://github.com/tonybaloney
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@dd0ff7641421a4a332a003c69acaaef1e4e9c18d
- Trigger Event: release