
Lightweight monitoring dashboard for vLLM inference servers


vLLM Metrics Monitor

One command to monitor your vLLM server metrics


vLLM Metrics Monitor (vmm) is a lightweight dashboard that scrapes Prometheus metrics from a vLLM server, persists them to SQLite, and serves a real-time web UI. It has no external Python dependencies and uses only the standard library.

[Image: Dashboard Preview]

🚀 Quick Start

# Install with uv
uv tool install vllm-metrics-monitor

# Or install with pip: `pip install vllm-metrics-monitor`

# Launch dashboard
vmm http://your-vllm:8000/metrics

Open http://localhost:8080 — that's it.

📖 Usage

vmm [URL] [OPTIONS]

Positional:
  URL                   vLLM Prometheus metrics endpoint
                        (default: http://localhost:8000/metrics)

Options:
  -p, --port PORT       Dashboard HTTP port (default: 8080)
  -i, --interval SEC    Scrape interval in seconds (default: 3)
  --retention HOURS     Data retention period (default: 720, i.e. 30 days)
  --db PATH             SQLite database path (default: ~/.vmm/data.db)
  --reset               Delete existing database and start fresh
  --debug               Enable debug logging
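
For readers who want to see how the options above fit together, here is a minimal sketch of an equivalent parser using `argparse`. It is an illustration only, not vmm's actual source: the `build_parser` function, the attribute names, and the integer types for `--interval` and `--retention` are assumptions. Only the flags and defaults come from the usage text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented vmm options."""
    parser = argparse.ArgumentParser(prog="vmm")
    # Optional positional URL with the documented default.
    parser.add_argument("url", nargs="?", default="http://localhost:8000/metrics",
                        help="vLLM Prometheus metrics endpoint")
    parser.add_argument("-p", "--port", type=int, default=8080,
                        help="Dashboard HTTP port")
    # Assumption: whole seconds and whole hours; the real tool may accept floats.
    parser.add_argument("-i", "--interval", type=int, default=3,
                        help="Scrape interval in seconds")
    parser.add_argument("--retention", type=int, default=720,
                        help="Data retention period in hours")
    parser.add_argument("--db", default="~/.vmm/data.db",
                        help="SQLite database path")
    parser.add_argument("--reset", action="store_true",
                        help="Delete existing database and start fresh")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug logging")
    return parser


# Same arguments as the first documented example.
args = build_parser().parse_args(
    ["http://vllm-server:8000/metrics", "-p", "9090", "-i", "5"]
)
print(args.url, args.port, args.interval, args.retention)
```

Arguments that are not passed fall back to the documented defaults, so `args.retention` stays at 720 hours here.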

Examples

vmm http://vllm-server:8000/metrics -p 9090 -i 5
vmm http://vllm-server:8000/metrics --reset
vmm http://vllm-server:8000/metrics --retention 72 --db /data/vmm.db
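
The `--retention` option implies that old samples are pruned from the database. A minimal sketch of how that could be done with `sqlite3` follows. The `samples` table, its columns, and the `prune` function are hypothetical; vmm's real schema is not documented here.

```python
import sqlite3
import time


def prune(conn, retention_hours, now=None):
    """Delete samples older than the retention window; return rows removed."""
    now = time.time() if now is None else now
    cutoff = now - retention_hours * 3600
    cur = conn.execute("DELETE FROM samples WHERE ts < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


# In-memory database with an assumed (ts, name, value) layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (ts REAL, name TEXT, value REAL)")
now = 1_700_000_000.0
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [
        (now - 100 * 3600, "vllm:num_requests_running", 1.0),  # 100 hours old
        (now - 1 * 3600, "vllm:num_requests_running", 2.0),    # 1 hour old
    ],
)
removed = prune(conn, retention_hours=72, now=now)
print(removed)
```

With a 72-hour window, only the 100-hour-old row is deleted.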

🏗️ Architecture

graph LR
    A[vLLM /metrics] -->|scrape every 3s| B[vmm]
    B --> C[Scraper Thread]
    C --> D[(SQLite<br/>~/.vmm/data.db)]
    B --> E[HTTP Server]
    E -->|JSON API| F[Browser<br/>Chart.js Dashboard]
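
The scrape-and-persist path in the diagram can be sketched with the standard library alone. This is an illustration under assumptions, not vmm's actual code: the `samples` table, the function names, and the choice to sum values across label sets are all hypothetical. The sample text uses metric names from the table in this README.

```python
import sqlite3
import time
import urllib.request

SAMPLE = """\
# HELP vllm:num_requests_running Number of requests currently running.
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="demo"} 2.0
vllm:num_requests_waiting{model_name="demo"} 0.0
vllm:generation_tokens_total{model_name="demo"} 1234.0
process_start_time_seconds 1.7e+09
"""


def parse_metrics(text):
    """Parse Prometheus text exposition into {name: value}.

    Values are summed across label sets. That is fine for gauges and
    counters, but not meaningful for histogram *_bucket series, which a
    real scraper would handle separately.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        if "{" in line:
            name = line[: line.index("{")]
            fields = line[line.rindex("}") + 1:].split()
        else:
            name, *fields = line.split()
        if not fields:
            continue
        values[name] = values.get(name, 0.0) + float(fields[0])
    return values


def store(conn, ts, values):
    """Insert one row per metric for this scrape timestamp."""
    conn.executemany(
        "INSERT INTO samples (ts, name, value) VALUES (?, ?, ?)",
        [(ts, name, value) for name, value in values.items()],
    )
    conn.commit()


def scrape_once(url, conn):
    """Fetch the endpoint once and persist the parsed samples (not run here)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        text = resp.read().decode("utf-8")
    store(conn, time.time(), parse_metrics(text))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (ts REAL, name TEXT, value REAL)")
store(conn, time.time(), parse_metrics(SAMPLE))
print(conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0])
```

A scraper thread would call something like `scrape_once` every `--interval` seconds. Because `sqlite3` connections are tied to the thread that created them by default, such a thread would normally open its own connection.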

📈 Monitored Metrics

Metric            Source                                Type
Running Requests  vllm:num_requests_running             Gauge
Waiting Requests  vllm:num_requests_waiting             Gauge
KV Cache Usage    vllm:kv_cache_usage_perc              Gauge
Cache Hit Rate    prompt_tokens_cached / prompt_tokens  Derived
Requests/s        vllm:request_success_total delta      Counter rate
Output Tokens/s   vllm:generation_tokens_total delta    Counter rate
Input Tokens/s    vllm:prompt_tokens_total delta        Counter rate
TTFT              time_to_first_token_seconds           Histogram avg
ITL               inter_token_latency_seconds           Histogram avg
E2E Latency       e2e_request_latency_seconds           Histogram avg
Uptime            process_start_time_seconds            Gauge
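
The "Counter rate", "Histogram avg", and "Derived" types can be computed from two consecutive scrapes. The sketch below shows one common way to do it. It assumes averages are taken over the scrape interval from a histogram's `_sum` and `_count` series; vmm may compute them differently, and all function names are hypothetical.

```python
def counter_rate(prev, curr, dt):
    """Per-second rate between two counter readings taken dt seconds apart."""
    if dt <= 0:
        return 0.0
    delta = curr - prev
    if delta < 0:
        # A counter that went down was reset (e.g. server restart);
        # count only what accumulated since the reset.
        delta = curr
    return delta / dt


def histogram_avg(prev_sum, prev_count, curr_sum, curr_count):
    """Mean observation over the interval: delta(_sum) / delta(_count)."""
    new_obs = curr_count - prev_count
    if new_obs <= 0:
        return None  # no new observations in this interval
    return (curr_sum - prev_sum) / new_obs


def hit_rate(cached_delta, prompt_delta):
    """Fraction of prompt tokens served from cache during the interval."""
    if prompt_delta <= 0:
        return None
    return cached_delta / prompt_delta


# Illustrative numbers only, 3 seconds between scrapes.
print(counter_rate(1000.0, 1150.0, 3.0))       # output tokens per second
print(histogram_avg(12.0, 40.0, 13.5, 50.0))   # average TTFT in seconds
print(hit_rate(300.0, 1200.0))                 # cache hit rate
```

Uptime needs no delta: it is the current time minus the `process_start_time_seconds` gauge.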

License

MIT
