
TraceAI vLLM Instrumentation

OpenTelemetry instrumentation for vLLM, enabling comprehensive observability for local LLM inference.

Installation

pip install traceai-vllm

Quick Start

from fi_instrumentation import register
from fi_instrumentation.fi_types import ProjectType
from traceai_vllm import VLLMInstrumentor

# Setup TraceAI
trace_provider = register(
    project_type=ProjectType.OBSERVE,
    project_name="my-vllm-app",
)

# Instrument vLLM (specify your server URLs)
VLLMInstrumentor(
    vllm_base_urls=["localhost:8000"]  # Your vLLM server(s)
).instrument(tracer_provider=trace_provider)

# Now use OpenAI client with vLLM
from openai import OpenAI

client = OpenAI(
    api_key="token",  # vLLM doesn't require a real API key
    base_url="http://localhost:8000/v1",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-2-7b-chat-hf",
    messages=[{"role": "user", "content": "Hello!"}]
)

Configuration

Custom vLLM Server URLs

You can specify multiple vLLM server URLs to instrument:

VLLMInstrumentor(
    vllm_base_urls=[
        "localhost:8000",
        "production-vllm.internal:8000",
        "staging-vllm.internal:8080",
    ]
).instrument(tracer_provider=trace_provider)

Features

  • Automatic tracing of vLLM API calls
  • Support for both synchronous and asynchronous operations
  • Streaming response support
  • Token usage tracking
  • Request/response attribute capture
  • Support for multiple vLLM server endpoints
  • OpenTelemetry semantic conventions for GenAI
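
Streaming responses arrive on the client side as a sequence of chunks whose incremental text sits at `choices[0].delta.content` (the standard OpenAI streaming shape). A minimal sketch of consuming such a stream — `collect_stream` is an illustrative helper, not part of this package:

```python
from typing import Iterable

def collect_stream(chunks: Iterable) -> str:
    """Concatenate the text deltas of a streamed chat completion.

    Each chunk is expected to follow the OpenAI streaming schema,
    where the incremental text lives at choices[0].delta.content
    (None for chunks that carry no text, e.g. the final one).
    """
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)

# Usage against a live vLLM server (not executed here):
# stream = client.chat.completions.create(
#     model="meta-llama/Llama-2-7b-chat-hf",
#     messages=[{"role": "user", "content": "Hello!"}],
#     stream=True,
# )
# text = collect_stream(stream)
```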

Captured Attributes

  • gen_ai.request.model - Model name
  • gen_ai.request.max_tokens - Maximum tokens
  • gen_ai.request.temperature - Temperature setting
  • gen_ai.usage.input_tokens - Input token count
  • gen_ai.usage.output_tokens - Output token count
  • gen_ai.prompt.{n}.role - Message roles
  • gen_ai.prompt.{n}.content - Message contents
  • gen_ai.completion.{n}.content - Response content
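
These land as flat key/value attributes on the exported span. A small sketch of reading token usage back out of such a mapping — the attribute dict below is hypothetical, shaped as an OpenTelemetry span exporter would see it, and `token_usage` is an illustrative helper:

```python
def token_usage(attributes: dict) -> tuple:
    """Read input/output token counts from a span's GenAI attributes.

    `attributes` is assumed to be the flat key/value mapping an
    OpenTelemetry span exporter receives; keys follow the list above.
    """
    return (
        int(attributes.get("gen_ai.usage.input_tokens", 0)),
        int(attributes.get("gen_ai.usage.output_tokens", 0)),
    )

# Hypothetical exported span attributes:
span_attrs = {
    "gen_ai.request.model": "meta-llama/Llama-2-7b-chat-hf",
    "gen_ai.usage.input_tokens": 12,
    "gen_ai.usage.output_tokens": 48,
}
```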

vLLM-Specific Parameters

The instrumentation also captures vLLM-specific parameters when provided:

  • best_of - Number of candidate sequences generated server-side (the best one is returned)
  • use_beam_search - Whether to use beam search
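
These parameters are not part of the standard OpenAI request schema, so when calling through the OpenAI Python client they have to be forwarded in the request body via `extra_body`. A sketch, with illustrative values:

```python
# vLLM-specific sampling options, forwarded verbatim in the JSON body.
vllm_params = {
    "best_of": 3,             # generate 3 candidate sequences server-side
    "use_beam_search": True,  # rank the candidates with beam search
}

# Usage against a live vLLM server (not executed here):
# response = client.chat.completions.create(
#     model="meta-llama/Llama-2-7b-chat-hf",
#     messages=[{"role": "user", "content": "Hello!"}],
#     extra_body=vllm_params,
# )
```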

Requirements

  • Python >= 3.9
  • openai >= 1.0.0
  • fi-instrumentation >= 0.1.0
  • opentelemetry-api >= 1.0.0
  • opentelemetry-sdk >= 1.0.0

Running vLLM Server

To use this instrumentation, you need a running vLLM server:

# Install vLLM
pip install vllm

# Start the server
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Llama-2-7b-chat-hf \
    --port 8000

License

Apache-2.0
