Rust-backed performance metrics and request tracing

llm-runtime-metrics (Python)

Python bindings for request metrics and Prometheus export.

Install from PyPI as llm-runtime-metrics.

Import in Python as:

import llm_runtime_metrics

The supported package root focuses on the request-metrics workflow.

Add LLM Metrics To An Existing Prometheus Server

from prometheus_client import CollectorRegistry, start_http_server
from llm_runtime_metrics import (
    REQUEST_FEATURE_IMAGE,
    REQUEST_FEATURE_TOOLS,
    RequestMetricsCollector,
    RequestMetricsFactory,
)

# Reuse your existing registry if you already have one.
registry = CollectorRegistry()

factory = RequestMetricsFactory(
    request_log_enabled=False,
    metric_prefix="llm_runtime",
    metrics_window_seconds=60.0,
    metrics_quantiles=[0.5, 0.9, 0.99],
)

# Registers a custom collector that pulls fresh samples from `factory` at scrape time.
RequestMetricsCollector(
    factory,
    base_labels={"service": "text-generation", "engine": "vllm"},
    registry=registry,
)

# If your app already exposes /metrics, wire this into that server instead.
start_http_server(8000, registry=registry)


# Example lifecycle hooks in your inference code:
def on_request_start(prompt_token_ids: list[int]):
    features = REQUEST_FEATURE_TOOLS | REQUEST_FEATURE_IMAGE
    return factory.new_request(prompt_token_ids, features=features)


def on_stream_step(req_metrics, full_output_token_ids: list[int], cached_tokens: int | None):
    # Use `is_diff=False` when passing cumulative token ids.
    req_metrics.record_tokens(full_output_token_ids, cached_tokens=cached_tokens, is_diff=False)


def on_request_success(req_metrics):
    req_metrics.success()


def on_request_cancel(req_metrics):
    req_metrics.cancel()
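The `is_diff` flag in `on_stream_step` separates cumulative token lists from incremental ones. The sketch below illustrates that distinction only. The helper `cumulative_to_diff` is hypothetical and not part of llm_runtime_metrics, and it assumes that `is_diff=True` means "only the tokens added since the previous call", which the text above does not state explicitly.

```python
def cumulative_to_diff(previous_len: int, cumulative_ids: list[int]) -> list[int]:
    """Return only the token ids appended since the previous stream step.

    Hypothetical helper for illustration; not part of llm_runtime_metrics.
    """
    if previous_len > len(cumulative_ids):
        raise ValueError("cumulative output shrank between stream steps")
    return cumulative_ids[previous_len:]


# Three stream steps as an engine might report them cumulatively.
steps = [[11], [11, 12, 13], [11, 12, 13, 14]]

seen = 0
diffs = []
for cumulative in steps:
    diffs.append(cumulative_to_diff(seen, cumulative))
    seen = len(cumulative)

print(diffs)  # [[11], [12, 13], [14]]
```

If your engine already yields cumulative ids, passing them unchanged with `is_diff=False` as shown above avoids this bookkeeping.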

The exported request latency distributions include:

  • request_metrics_ttft_ms: time to first token (TTFT).
  • request_metrics_time_to_fifth_token_ms: elapsed time from request start until the fifth output token is observed by record_tokens.
  • request_metrics_first_to_fifth_token_itl_ms: elapsed time between observing the first and fifth output tokens.
  • request_metrics_itl_ms: inter-token latency (ITL).
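To make the fifth-token definitions concrete, here is a small worked sketch in plain Python. It only mirrors the wording above. The `Observation` type and both functions are hypothetical, and the library's internal computation may differ in detail. Each observation stands for one `record_tokens` call, so a token counts as "observed" at the time of the call that first includes it.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Observation:
    """One hypothetical record_tokens call: its time and the cumulative token count."""

    t_ms: float
    total_tokens: int


def first_time_reaching(observations: list[Observation], n: int) -> float | None:
    """Timestamp of the first observation whose cumulative count is at least n."""
    for obs in observations:
        if obs.total_tokens >= n:
            return obs.t_ms
    return None


def latency_summary(start_ms: float, observations: list[Observation]) -> dict[str, float | None]:
    """Compute the three latencies from a request start time and its observations."""
    t1 = first_time_reaching(observations, 1)
    t5 = first_time_reaching(observations, 5)
    return {
        "ttft_ms": None if t1 is None else t1 - start_ms,
        "time_to_fifth_token_ms": None if t5 is None else t5 - start_ms,
        "first_to_fifth_token_itl_ms": None if t1 is None or t5 is None else t5 - t1,
    }


# Request starts at t=100 ms; tokens arrive in batches of 1, 2, and 3.
obs = [Observation(120.0, 1), Observation(150.0, 3), Observation(185.0, 6)]
summary = latency_summary(100.0, obs)
print(summary)
# {'ttft_ms': 20.0, 'time_to_fifth_token_ms': 85.0, 'first_to_fifth_token_itl_ms': 65.0}
```

In this sketch the fifth token arrives inside the third batch, so its observation time is the time of that third call.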

Available request feature bits:

  • REQUEST_FEATURE_NONE
  • REQUEST_FEATURE_XGRAMMAR
  • REQUEST_FEATURE_TOOLS
  • REQUEST_FEATURE_IMAGE
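The example earlier combines feature bits with `|`, which suggests they behave like integer bit masks. The following sketch shows how such flags combine and how membership can be tested. The `Feature` enum and its numeric values are placeholders chosen for illustration; the real constants in llm_runtime_metrics may have different values or types.

```python
from enum import IntFlag


class Feature(IntFlag):
    # Placeholder values for illustration only; the real constants may differ.
    NONE = 0
    XGRAMMAR = 1
    TOOLS = 2
    IMAGE = 4


# Combine two features into a single mask, as in on_request_start.
features = Feature.TOOLS | Feature.IMAGE

print(Feature.TOOLS in features)     # True
print(Feature.XGRAMMAR in features)  # False
print(int(features))                 # 6
```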

If you need plain text output instead of a collector, call:

text = factory.prometheus_strfmt({"service": "text-generation", "engine": "vllm"})
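One possible use of the plain-text form is to serve it from an endpoint you control. The sketch below is a minimal standard-library WSGI app under that assumption. `make_metrics_app` and `fake_render` are hypothetical names, and `fake_render` stands in for a call to `factory.prometheus_strfmt(...)` so the example runs without the package installed.

```python
from wsgiref.util import setup_testing_defaults


def make_metrics_app(render):
    """Build a WSGI app that serves render() as Prometheus text at /metrics."""

    def app(environ, start_response):
        if environ.get("PATH_INFO") != "/metrics":
            start_response("404 Not Found", [("Content-Type", "text/plain; charset=utf-8")])
            return [b"not found\n"]
        body = render().encode("utf-8")
        headers = [
            # Content type used by the Prometheus text exposition format.
            ("Content-Type", "text/plain; version=0.0.4; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]

    return app


def fake_render() -> str:
    # Stand-in for: factory.prometheus_strfmt({"service": "text-generation"})
    return "# TYPE demo_requests_total counter\ndemo_requests_total 1\n"


app = make_metrics_app(fake_render)

# Call the app directly with a synthetic request instead of opening a socket.
environ = {"PATH_INFO": "/metrics"}
setup_testing_defaults(environ)
captured = {}


def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


body = b"".join(app(environ, start_response)).decode("utf-8")
print(captured["status"])
print(body)
```

To expose it over HTTP, the same `app` can be passed to `wsgiref.simple_server.make_server` or to whichever WSGI server your service already runs.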

Download files

Source Distribution

  • llm_runtime_metrics-0.0.12.tar.gz (85.2 kB): Source

Built Distributions

  • llm_runtime_metrics-0.0.12-cp314-cp314t-musllinux_1_2_x86_64.whl (7.1 MB): CPython 3.14t, musllinux: musl 1.2+ x86-64
  • llm_runtime_metrics-0.0.12-cp314-cp314t-musllinux_1_2_i686.whl (7.1 MB): CPython 3.14t, musllinux: musl 1.2+ i686
  • llm_runtime_metrics-0.0.12-cp314-cp314t-musllinux_1_2_armv7l.whl (6.8 MB): CPython 3.14t, musllinux: musl 1.2+ ARMv7l
  • llm_runtime_metrics-0.0.12-cp314-cp314t-musllinux_1_2_aarch64.whl (6.9 MB): CPython 3.14t, musllinux: musl 1.2+ ARM64
  • llm_runtime_metrics-0.0.12-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.8 MB): CPython 3.14t, manylinux: glibc 2.17+ x86-64
  • llm_runtime_metrics-0.0.12-cp314-cp314t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (8.1 MB): CPython 3.14t, manylinux: glibc 2.17+ ppc64le
  • llm_runtime_metrics-0.0.12-cp314-cp314t-manylinux_2_17_i686.manylinux2014_i686.whl (7.3 MB): CPython 3.14t, manylinux: glibc 2.17+ i686
  • llm_runtime_metrics-0.0.12-cp314-cp314t-macosx_11_0_arm64.whl (6.0 MB): CPython 3.14t, macOS 11.0+ ARM64
  • llm_runtime_metrics-0.0.12-cp314-cp314t-macosx_10_12_x86_64.whl (6.3 MB): CPython 3.14t, macOS 10.12+ x86-64
  • llm_runtime_metrics-0.0.12-cp310-abi3-win_amd64.whl (5.6 MB): CPython 3.10+, Windows x86-64
  • llm_runtime_metrics-0.0.12-cp310-abi3-musllinux_1_2_x86_64.whl (7.1 MB): CPython 3.10+, musllinux: musl 1.2+ x86-64
  • llm_runtime_metrics-0.0.12-cp310-abi3-musllinux_1_2_i686.whl (7.1 MB): CPython 3.10+, musllinux: musl 1.2+ i686
  • llm_runtime_metrics-0.0.12-cp310-abi3-musllinux_1_2_armv7l.whl (6.8 MB): CPython 3.10+, musllinux: musl 1.2+ ARMv7l
  • llm_runtime_metrics-0.0.12-cp310-abi3-musllinux_1_2_aarch64.whl (6.9 MB): CPython 3.10+, musllinux: musl 1.2+ ARM64
  • llm_runtime_metrics-0.0.12-cp310-abi3-manylinux_2_28_ppc64le.whl (8.1 MB): CPython 3.10+, manylinux: glibc 2.28+ ppc64le
  • llm_runtime_metrics-0.0.12-cp310-abi3-manylinux_2_28_armv7l.whl (6.6 MB): CPython 3.10+, manylinux: glibc 2.28+ ARMv7l
  • llm_runtime_metrics-0.0.12-cp310-abi3-manylinux_2_28_aarch64.whl (6.7 MB): CPython 3.10+, manylinux: glibc 2.28+ ARM64
  • llm_runtime_metrics-0.0.12-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.8 MB): CPython 3.10+, manylinux: glibc 2.17+ x86-64
  • llm_runtime_metrics-0.0.12-cp310-abi3-manylinux_2_17_i686.manylinux2014_i686.whl (7.3 MB): CPython 3.10+, manylinux: glibc 2.17+ i686
  • llm_runtime_metrics-0.0.12-cp310-abi3-macosx_11_0_arm64.whl (6.0 MB): CPython 3.10+, macOS 11.0+ ARM64
  • llm_runtime_metrics-0.0.12-cp310-abi3-macosx_10_12_x86_64.whl (6.3 MB): CPython 3.10+, macOS 10.12+ x86-64
