llmeter

A lightweight, cross-platform latency and throughput profiler for LLMs

These details have been verified by PyPI

Project links

Repository

GitHub Statistics

Maintainers

llmeter-maintainers

These details have not been verified by PyPI

Project description

Measuring latency and throughput of large language models

LLMeter is a pure-Python library for simple latency and throughput testing of large language models (LLMs). It's designed to be lightweight to install, straightforward to run standard tests with, and versatile to integrate, whether in notebooks, CI/CD, or other workflows.

📖 For full details, check out our documentation at: https://awslabs.github.io/llmeter

🛠️ Installation

LLMeter requires Python 3.10 or later (python>=3.10), so please make sure your current version of Python is compatible.

To get the basic metering functionality, install the minimum package using pip or uv:

pip install llmeter

Or with uv (recommended for faster installation):

uv pip install llmeter

LLMeter also offers extra features that require additional dependencies. Currently these extras include:

plotting: Add methods to generate charts to summarize the results
openai: Enable testing endpoints offered by OpenAI
litellm: Enable testing a range of different models through LiteLLM
mlflow: Enable logging LLMeter experiments to MLflow

You can install one or more of these extra options using pip:

pip install 'llmeter[plotting,openai,litellm,mlflow]'

Or with uv:

uv pip install 'llmeter[plotting,openai,litellm,mlflow]'
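If you're not sure which extras ended up in your environment, a quick check is to see whether the third-party module behind each extra can be imported. The sketch below is only an illustration: the mapping from extra name to module name is an assumption (it is not read from LLMeter's package metadata), and available_extras is a hypothetical helper, not part of LLMeter.

```python
from importlib.util import find_spec

# Assumed mapping from each LLMeter extra to a top-level module it is
# expected to pull in. These module names are illustrative guesses,
# not taken from the package metadata.
ASSUMED_EXTRA_MODULES = {
    "plotting": "plotly",
    "openai": "openai",
    "litellm": "litellm",
    "mlflow": "mlflow",
}


def available_extras(mapping=ASSUMED_EXTRA_MODULES):
    """Return {extra_name: bool}, True when the mapped module can be found."""
    return {extra: find_spec(module) is not None for extra, module in mapping.items()}


if __name__ == "__main__":
    for extra, present in available_extras().items():
        print(f"{extra}: {'installed' if present else 'missing'}")
```

Using find_spec avoids actually importing the heavier packages just to test for their presence.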

🚀 Quick-start

At a high level, you'll start by configuring an LLMeter "Endpoint" for whatever type of LLM you're connecting to:

# For example with Amazon Bedrock...
from llmeter.endpoints import BedrockConverse
endpoint = BedrockConverse(model_id="...")

# ...or OpenAI...
from llmeter.endpoints import OpenAIEndpoint
endpoint = OpenAIEndpoint(model_id="...", api_key="...")

# ...or via LiteLLM...
from llmeter.endpoints import LiteLLM
endpoint = LiteLLM("{provider}/{model_id}")

# ...and so on

You can then run the high-level "experiments" offered by LLMeter:

# Testing how throughput varies with concurrent request count:
from llmeter.experiments import LoadTest
load_test = LoadTest(
    endpoint=endpoint,
    payload={...},
    sequence_of_clients=[1, 5, 20, 50, 100, 500],
    output_path="local or S3 path"
)
load_test_results = await load_test.run()
load_test_results.plot_results()

Here, payload can be a single dictionary, a list of dictionaries, or a path to a JSON Lines file that contains one payload per line.
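For the JSON Lines option, each line of the file is one complete JSON object. As a minimal sketch using only the standard library, you could write and re-read such a file like this. The helper names and the payload keys are hypothetical placeholders: the real keys depend on the request format of your target API.

```python
import json
from pathlib import Path


def write_payloads_jsonl(payloads, path):
    """Write one JSON-encoded payload per line (JSON Lines format)."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for payload in payloads:
            f.write(json.dumps(payload) + "\n")
    return path


def read_payloads_jsonl(path):
    """Read the payloads back, skipping any blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical payloads: these keys are placeholders, not a real request schema.
example_payloads = [
    {"prompt": "Summarize the water cycle.", "max_tokens": 128},
    {"prompt": "Translate 'hello' to French.", "max_tokens": 32},
]
```

The path returned by write_payloads_jsonl is what you would pass as payload in place of a dictionary.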

Each LLMeter Endpoint type offers a create_payload() function you can use to help build your inputs, in case you're not sure of the request JSON format for your target API. For example with Amazon Bedrock Converse:

from llmeter.endpoints import BedrockConverse
from llmeter.prompt_utils import ImageContent
payload = BedrockConverse.create_payload(
    user_messages=[
        "Describe the following image:",
        ImageContent.from_path("photo.jpg"),
    ],
    max_tokens=1024,
)

As well as the high-level Experiments, you can use the low-level llmeter.runner.Runner class to run and analyze request batches - and build your own custom experiments.

from llmeter.runner import Runner

endpoint_test = Runner(
    endpoint,
    tokenizer=tokenizer,  # assumes `tokenizer` was defined earlier
    output_path="local or S3 path",
)
result = await endpoint_test.run(
    payload={...},
    n_requests=3,
    clients=3,
)

print(result.stats)
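The snippets above call await at the top level, which works in a notebook but is a syntax error in a plain Python script. In a script you can wrap the call in a coroutine and hand it to asyncio.run. The sketch below uses a hypothetical stand-in coroutine (run_test) in place of a real load_test.run() or endpoint_test.run(...) call, so it runs without any endpoint configured.

```python
import asyncio


async def run_test():
    """Stand-in for an awaitable such as load_test.run().

    Hypothetical placeholder: it only sleeps briefly and returns a dict.
    """
    await asyncio.sleep(0.01)
    return {"status": "done"}


def main():
    # asyncio.run creates an event loop, runs the coroutine to completion,
    # then closes the loop and returns the coroutine's result.
    return asyncio.run(run_test())


if __name__ == "__main__":
    print(main())
```

In your own script, the body of run_test would build the endpoint and await the experiment instead of sleeping.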

Additional functionality like cost modelling and MLflow experiment tracking is enabled through llmeter.callbacks, and you can write your own callbacks to hook other custom logic into LLMeter test runs.
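To give a feel for what a token-based cost model computes, here is a standalone sketch. It does not use the llmeter.callbacks API: TokenPricing, request_cost and total_cost are hypothetical names, and the prices are made up for illustration rather than taken from any provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Hypothetical prices per 1,000 tokens; not real provider pricing."""

    input_per_1k: float
    output_per_1k: float


def request_cost(pricing, input_tokens, output_tokens):
    """Cost of one request: input and output tokens are priced separately."""
    return (
        (input_tokens / 1000) * pricing.input_per_1k
        + (output_tokens / 1000) * pricing.output_per_1k
    )


def total_cost(pricing, requests):
    """Sum the cost over an iterable of (input_tokens, output_tokens) pairs."""
    return sum(request_cost(pricing, i, o) for i, o in requests)
```

For example, with TokenPricing(input_per_1k=1.0, output_per_1k=2.0), a request with 1,000 input tokens and 500 output tokens costs 1.0 + 1.0 = 2.0 in whatever currency the prices are expressed in.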

For more details, check out the LLMeter user guide and our selection of end-to-end code examples in the examples folder!

Analyze and compare results

You can analyze the results of a single run or a load test by generating interactive charts. You can find examples in the examples folder.

Load testing

You can generate a collection of standard charts to visualize the result of a load test:

# Load test results
from llmeter.experiments import LoadTestResult
load_test_result = LoadTestResult.load("local or S3 path", test_name="Test result")

figures = load_test_result.plot_results()



To see how to compare two load tests, refer to "Compare load test".

Single Run visualizations

Metrics like time to first token (TTFT) and time per output token (TPOT) are described as distributions. While summary statistics of these distributions (median, 90th percentile, average, etc.) are a convenient way to compare them, visualizations provide more insight into the endpoint's behavior.
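As a minimal sketch of the summary statistics mentioned above, the standard library's statistics module can compute them from a list of per-request measurements. The summarize helper and the sample values are made up for illustration; this is not how LLMeter computes its own stats.

```python
import statistics


def summarize(samples):
    """Return average, median and 90th percentile of latency samples (seconds)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=10) returns the 9 cut points between deciles; the last is p90.
    p90 = statistics.quantiles(samples, n=10, method="inclusive")[-1]
    return {
        "average": statistics.fmean(samples),
        "median": statistics.median(samples),
        "p90": p90,
    }


# Made-up TTFT samples (seconds) with one slow outlier.
ttft_samples = [0.21, 0.19, 0.22, 0.20, 0.23, 0.18, 0.95, 0.21, 0.20, 0.22]

if __name__ == "__main__":
    print(summarize(ttft_samples))
```

With these samples the single slow request pulls the average well above the median, which is the kind of skew a boxplot or histogram makes visible at a glance.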

Boxplot

import plotly.graph_objects as go
from llmeter.plotting import boxplot_by_dimension
from llmeter.results import Result  # assumed import path for Result

result = Result.load("local or S3 path")

fig = go.Figure()
trace = boxplot_by_dimension(result=result, dimension="time_to_first_token")
fig.add_trace(trace)

Multiple traces can easily be combined into the same figure.

Histograms

import plotly.graph_objects as go
from llmeter.plotting import histogram_by_dimension
from llmeter.results import Result  # assumed import path for Result

result = Result.load("local or S3 path")

fig = go.Figure()
trace = histogram_by_dimension(result=result, dimension="time_to_first_token", xbins={"size": 0.02})
fig.add_trace(trace)

Multiple traces can easily be combined into the same figure.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Project details

Release history

Version	Release date	Notes
0.1.12	Jun 24, 2026	This version
0.1.11	Apr 24, 2026
0.1.10.1	Apr 22, 2026	Yanked. Reason: regression that breaks RunningStats datetime comparisons. Use 0.1.10 instead.
0.1.10	Apr 22, 2026
0.1.9	Mar 30, 2026
0.1.8	Mar 24, 2026
0.1.7	Dec 9, 2025
0.1.6	Nov 18, 2025
0.1.5	Mar 17, 2025
0.1.4	Dec 19, 2024
0.1.3	Oct 9, 2024
0.1.2	Oct 7, 2024
0.1.1	Oct 6, 2024
0.1.0	Oct 6, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llmeter-0.1.12.tar.gz (572.2 kB)

Uploaded Jun 24, 2026 Source

Built Distribution

llmeter-0.1.12-py3-none-any.whl (112.2 kB)

Uploaded Jun 24, 2026 Python 3

File details

Details for the file llmeter-0.1.12.tar.gz.

File metadata

Download URL: llmeter-0.1.12.tar.gz
Upload date: Jun 24, 2026
Size: 572.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llmeter-0.1.12.tar.gz
Algorithm	Hash digest
SHA256	`4ccd3355702a70a9b3f1c36b44ebf80284683f5b65fd05eb44da09ca96d545b6`
MD5	`efc2af229e9472a399189777552313f9`
BLAKE2b-256	`631eccfa33cf9a2250762f6767a48e87a10d1f28754bbb6f25f34afab0db887a`
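To check a downloaded file against the digests listed above, you can recompute its hash locally and compare. The sketch below uses only hashlib from the standard library. The helper names sha256_of_file and matches_published_hash are hypothetical, and the file path is whatever location you downloaded the archive to.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """True when the file's SHA256 equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the source distribution you would call matches_published_hash("llmeter-0.1.12.tar.gz", "4ccd3355702a70a9b3f1c36b44ebf80284683f5b65fd05eb44da09ca96d545b6") and expect True for an intact download.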

See more details on using hashes here.

Provenance

The following attestation bundles were made for llmeter-0.1.12.tar.gz:

Publisher: pypi.yml on awslabs/llmeter

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llmeter-0.1.12.tar.gz
- Subject digest: 4ccd3355702a70a9b3f1c36b44ebf80284683f5b65fd05eb44da09ca96d545b6
- Sigstore transparency entry: 1936231403
- Sigstore integration time: Jun 24, 2026
Source repository:
- Permalink: awslabs/llmeter@1c0a6e1aae36502c6a9076887e9bf0fde567f892
- Branch / Tag: refs/tags/v0.1.12
- Owner: https://github.com/awslabs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi.yml@1c0a6e1aae36502c6a9076887e9bf0fde567f892
- Trigger Event: push

File details

Details for the file llmeter-0.1.12-py3-none-any.whl.

File metadata

Download URL: llmeter-0.1.12-py3-none-any.whl
Upload date: Jun 24, 2026
Size: 112.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llmeter-0.1.12-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6f0e5ee9aef3be49a3713c5b5d1b92ead6ebcd00b07ff6c361a29427fa02c871`
MD5	`c7d17db885f609d1cb24a872e9ba0bb1`
BLAKE2b-256	`c0aa026f65f0e92461a61ea0b77b5b3d01f94a5bab51b7d25e289dbd01c606a1`

See more details on using hashes here.

Provenance

The following attestation bundles were made for llmeter-0.1.12-py3-none-any.whl:

Publisher: pypi.yml on awslabs/llmeter

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llmeter-0.1.12-py3-none-any.whl
- Subject digest: 6f0e5ee9aef3be49a3713c5b5d1b92ead6ebcd00b07ff6c361a29427fa02c871
- Sigstore transparency entry: 1936231590
- Sigstore integration time: Jun 24, 2026
Source repository:
- Permalink: awslabs/llmeter@1c0a6e1aae36502c6a9076887e9bf0fde567f892
- Branch / Tag: refs/tags/v0.1.12
- Owner: https://github.com/awslabs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi.yml@1c0a6e1aae36502c6a9076887e9bf0fde567f892
- Trigger Event: push
