Lightweight monitoring dashboard for vLLM inference servers

vLLM Metrics Monitor

One command to monitor your vLLM server metrics

vLLM Metrics Monitor (vmm) is a lightweight dashboard that scrapes Prometheus metrics from a vLLM server, persists them to SQLite, and serves a real-time web UI. It has zero external dependencies and uses only the Python standard library.
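To make the scraping step concrete, here is a minimal sketch of parsing the Prometheus text exposition format with only the standard library. It is not vmm's actual parser: the function name `parse_metrics`, the regular expression, and the choice to sum samples across labels are all assumptions made for illustration.

```python
import re

# Matches sample lines such as: vllm:num_requests_running{model_name="m"} 3.0
# An optional trailing integer timestamp is tolerated and ignored.
_SAMPLE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)(?:\s+-?\d+)?$'
)


def parse_metrics(text):
    """Return {metric_name: value} from Prometheus text exposition.

    Hypothetical helper: labels are ignored and samples that share a
    name are summed. That is reasonable for gauges and counters, but
    histogram *_bucket series would need their `le` label preserved.
    """
    totals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comments
            continue
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue
        name, _labels, raw = match.groups()
        try:
            value = float(raw)  # float() also accepts "NaN" and "+Inf"
        except ValueError:
            continue
        totals[name] = totals.get(name, 0.0) + value
    return totals


# Illustrative payload; the label name and values are made up.
SAMPLE = """\
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="demo"} 3.0
# TYPE vllm:num_requests_waiting gauge
vllm:num_requests_waiting{model_name="demo"} 1.0
"""

parsed = parse_metrics(SAMPLE)
print(parsed)
```

In a real scrape the text would come from an HTTP GET against the metrics URL, for example with `urllib.request.urlopen`.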

[Image: Dashboard Preview]

🚀 Quick Start

# Install with uv
uv tool install vllm-metrics-monitor

# Or install with pip: `pip install vllm-metrics-monitor`

# Launch dashboard
vmm http://your-vllm:8000/metrics

Then open http://localhost:8080 (the default dashboard port) in your browser.

📖 Usage

vmm [URL] [OPTIONS]

Positional:
  URL                   vLLM Prometheus metrics endpoint
                        (default: http://localhost:8000/metrics)

Options:
  -p, --port PORT       Dashboard HTTP port (default: 8080)
  -i, --interval SEC    Scrape interval in seconds (default: 3)
  --retention HOURS     Data retention period (default: 720, i.e. 30 days)
  --db PATH             SQLite database path (default: ~/.vmm/data.db)
  --reset               Delete existing database and start fresh
  --debug               Enable debug logging

Examples

vmm http://vllm-server:8000/metrics -p 9090 -i 5
vmm http://vllm-server:8000/metrics --reset
vmm http://vllm-server:8000/metrics --retention 72 --db /data/vmm.db
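The option list above maps naturally onto `argparse`. The following is a sketch of how such a command line could be declared, not vmm's actual source: `build_parser` is a hypothetical name, and the argument types (integer port, float interval and retention) are assumptions. Flags and defaults are copied from the usage text.

```python
import argparse
import os


def build_parser():
    """Hypothetical parser mirroring the documented vmm options."""
    parser = argparse.ArgumentParser(prog="vmm")
    parser.add_argument(
        "url", nargs="?", default="http://localhost:8000/metrics",
        metavar="URL", help="vLLM Prometheus metrics endpoint",
    )
    parser.add_argument("-p", "--port", type=int, default=8080,
                        help="Dashboard HTTP port")
    parser.add_argument("-i", "--interval", type=float, default=3,
                        metavar="SEC", help="Scrape interval in seconds")
    parser.add_argument("--retention", type=float, default=720,
                        metavar="HOURS", help="Data retention period")
    parser.add_argument("--db", default=os.path.expanduser("~/.vmm/data.db"),
                        metavar="PATH", help="SQLite database path")
    parser.add_argument("--reset", action="store_true",
                        help="Delete existing database and start fresh")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug logging")
    return parser


# Same arguments as the first example command above.
args = build_parser().parse_args(
    ["http://vllm-server:8000/metrics", "-p", "9090", "-i", "5"]
)
defaults = build_parser().parse_args([])
print(args.url, args.port, args.interval)
```

Parsing an empty argument list, as in `defaults`, yields the documented default values, which is a quick way to check that a parser and its help text agree.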

🏗️ Architecture

graph LR
    A[vLLM /metrics] -->|scrape every 3s| B[vmm]
    B --> C[Scraper Thread]
    C --> D[(SQLite<br/>~/.vmm/data.db)]
    B --> E[HTTP Server]
    E -->|JSON API| F[Browser<br/>Chart.js Dashboard]
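The scraper-to-SQLite leg of the diagram, together with the `--retention` window, can be sketched as follows. The table layout, the function names `open_db`, `store_scrape`, and `prune`, and the pruning query are assumptions for illustration; vmm's real schema is not described in this document.

```python
import sqlite3
import time


def open_db(path):
    """Open (or create) a database with a hypothetical samples table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " ts REAL NOT NULL, name TEXT NOT NULL, value REAL)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_samples_ts ON samples (ts)")
    return conn


def store_scrape(conn, metrics, now=None):
    """Insert one row per metric, all sharing a single scrape timestamp."""
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on an exception
        conn.executemany(
            "INSERT INTO samples (ts, name, value) VALUES (?, ?, ?)",
            [(now, name, value) for name, value in metrics.items()],
        )


def prune(conn, retention_hours, now=None):
    """Delete rows older than the retention window; return rows removed."""
    now = time.time() if now is None else now
    cutoff = now - retention_hours * 3600
    with conn:
        cur = conn.execute("DELETE FROM samples WHERE ts < ?", (cutoff,))
    return cur.rowcount


# In-memory database and fixed timestamps keep the example deterministic.
conn = open_db(":memory:")
store_scrape(conn, {"vllm:num_requests_running": 3.0}, now=0.0)
store_scrape(conn, {"vllm:num_requests_running": 5.0}, now=100 * 3600.0)
removed = prune(conn, retention_hours=72, now=100 * 3600.0)
rows = conn.execute("SELECT ts, value FROM samples").fetchall()
print(removed, rows)
```

One design point to keep in mind: a `sqlite3` connection is bound to the thread that created it by default, so a scraper thread and an HTTP server would typically each open their own connection to the same file.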

📈 Monitored Metrics

| Metric | Source | Type |
|---|---|---|
| Running Requests | vllm:num_requests_running | Gauge |
| Waiting Requests | vllm:num_requests_waiting | Gauge |
| KV Cache Usage | vllm:kv_cache_usage_perc | Gauge |
| Cache Hit Rate | prompt_tokens_cached / prompt_tokens | Derived |
| Requests/s | vllm:request_success_total delta | Counter rate |
| Output Tokens/s | vllm:generation_tokens_total delta | Counter rate |
| Input Tokens/s | vllm:prompt_tokens_total delta | Counter rate |
| TTFT | time_to_first_token_seconds | Histogram avg |
| ITL | inter_token_latency_seconds | Histogram avg |
| E2E Latency | e2e_request_latency_seconds | Histogram avg |
| Uptime | process_start_time_seconds | Gauge |
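The "Counter rate", "Histogram avg", and "Derived" types above all come down to arithmetic on two consecutive scrapes. The sketch below shows one common way to compute them; the function names are hypothetical, and the assumptions that averages use the histogram `_sum` and `_count` series and that counter resets are treated as a restart from zero are mine, not confirmed by vmm.

```python
def counter_rate(prev, curr, interval_s):
    """Per-second rate between two readings of a monotonic counter."""
    delta = curr - prev
    if delta < 0:  # counter went backwards, e.g. the server restarted
        delta = curr
    return delta / interval_s


def histogram_avg(prev_sum, prev_count, curr_sum, curr_count):
    """Mean observation over the interval, from _sum and _count deltas."""
    d_count = curr_count - prev_count
    if d_count <= 0:  # no new observations (or a reset): no average
        return None
    return (curr_sum - prev_sum) / d_count


def ratio(numerator, denominator):
    """Safe division for derived metrics such as a cache hit rate."""
    return numerator / denominator if denominator else None


# Two scrapes taken 3 seconds apart (made-up numbers).
tokens_per_s = counter_rate(prev=1000, curr=1300, interval_s=3)
ttft_avg = histogram_avg(prev_sum=10.0, prev_count=20,
                         curr_sum=12.5, curr_count=30)
hit_rate = ratio(250, 1000)
print(tokens_per_s, ttft_avg, hit_rate)  # 100.0 0.25 0.25
```

Uptime follows the same pattern with a single reading: the current time minus `process_start_time_seconds`.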

License

MIT

Download files

Source Distribution

vllm_metrics_monitor-0.2.1.tar.gz (149.8 kB)

Uploaded Source

Built Distribution

vllm_metrics_monitor-0.2.1-py3-none-any.whl (28.3 kB)

Uploaded Python 3

File details

Details for the file vllm_metrics_monitor-0.2.1.tar.gz.

File metadata

  • Download URL: vllm_metrics_monitor-0.2.1.tar.gz
  • Size: 149.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.16 (Ubuntu 24.04)

File hashes

Hashes for vllm_metrics_monitor-0.2.1.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | 76845726b5d4a5ee077faa4392739b4cc9ca2b463bb26839b7278a134462a52e |
| MD5 | 86d1b49bc4b7bed102bf371ccbce647b |
| BLAKE2b-256 | b8250b36051eba195a3f1ede796a00e55642401582556e3d5162665450bee517 |

File details

Details for the file vllm_metrics_monitor-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: vllm_metrics_monitor-0.2.1-py3-none-any.whl
  • Size: 28.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.16 (Ubuntu 24.04)

File hashes

Hashes for vllm_metrics_monitor-0.2.1-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6d094d259e8e6641ca9ff51772f8b0ba205bcc44a31655a98ed5c1037d1628be |
| MD5 | e9a9ad6ece579184f5a8c47137e7c046 |
| BLAKE2b-256 | 694224643baf25c220d3065e05c9462d92f3bcbe350967e3045432a4b3366245 |
