Lightweight monitoring dashboard for vLLM inference servers

vLLM Metrics Monitor

One command to monitor your vLLM server metrics

vLLM Metrics Monitor (vmm) is a lightweight dashboard that scrapes Prometheus metrics from a vLLM server, persists them to SQLite, and serves a real-time web UI. It has zero external dependencies and uses only the Python standard library.

🚀 Quick Start

# Install with uv
uv tool install vllm-metrics-monitor

# Or install with pip
pip install vllm-metrics-monitor

# Launch dashboard
vmm http://your-vllm:8000/metrics

Then open http://localhost:8080 in a browser to view the dashboard.

📖 Usage

vmm [URL] [OPTIONS]

Positional:
  URL                   vLLM Prometheus metrics endpoint
                        (default: http://localhost:8000/metrics)

Options:
  -p, --port PORT       Dashboard HTTP port (default: 8080)
  -i, --interval SEC    Scrape interval in seconds (default: 3)
  --retention HOURS     Data retention period (default: 720, i.e. 30 days)
  --db PATH             SQLite database path (default: ~/.vmm/data.db)
  --reset               Delete existing database and start fresh
  --debug               Enable debug logging
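
For readers who want to see how the options above fit together, here is a minimal sketch of an equivalent parser built with `argparse` from the standard library. The function name `build_parser` and the attribute names are assumptions made for illustration; vmm's actual argument handling may differ. Only the flags and defaults are taken from the usage text above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented vmm options."""
    parser = argparse.ArgumentParser(prog="vmm")
    parser.add_argument(
        "url",
        nargs="?",  # positional URL is optional and falls back to the default
        default="http://localhost:8000/metrics",
        help="vLLM Prometheus metrics endpoint",
    )
    parser.add_argument("-p", "--port", type=int, default=8080,
                        help="Dashboard HTTP port")
    parser.add_argument("-i", "--interval", type=float, default=3,
                        help="Scrape interval in seconds")
    parser.add_argument("--retention", type=float, default=720,
                        help="Data retention period in hours")
    parser.add_argument("--db", default="~/.vmm/data.db",
                        help="SQLite database path")
    parser.add_argument("--reset", action="store_true",
                        help="Delete existing database and start fresh")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug logging")
    return parser


# Parse the same arguments as the first documented example.
args = build_parser().parse_args(
    ["http://vllm-server:8000/metrics", "-p", "9090", "-i", "5"]
)
defaults = build_parser().parse_args([])
print(args.url, args.port, args.interval)
```

Options that are omitted keep their documented defaults, so `defaults.retention` is 720 hours (30 days).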

Examples

vmm http://vllm-server:8000/metrics -p 9090 -i 5
vmm http://vllm-server:8000/metrics --reset
vmm http://vllm-server:8000/metrics --retention 72 --db /data/vmm.db
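
The `--retention` option implies that old samples are deleted periodically. One plausible way to do this is sketched below. The table name `samples`, its columns, and the function `prune_old_samples` are hypothetical; the real schema used by vmm is not documented here.

```python
import sqlite3
import time


def prune_old_samples(conn, retention_hours, now=None):
    """Delete rows older than the retention window; return how many were removed."""
    now = time.time() if now is None else now
    cutoff = now - retention_hours * 3600  # hours -> seconds
    cur = conn.execute("DELETE FROM samples WHERE ts < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


# Demo against an in-memory database with an assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (ts REAL, name TEXT, value REAL)")
now = 1_000_000.0
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [
        (now - 100 * 3600, "vllm:num_requests_running", 1.0),  # older than 72 h
        (now - 10 * 3600, "vllm:num_requests_running", 2.0),   # within 72 h
    ],
)
removed = prune_old_samples(conn, retention_hours=72, now=now)
remaining = conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
print(removed, remaining)
```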

Docker

# Build
docker build -t vmm .

# Run
docker run -d --network host vmm http://localhost:8000/metrics

# Or use docker compose
METRICS_URL=http://192.168.1.100:8000/metrics docker compose up -d

🏗️ Architecture

graph LR
    A[vLLM /metrics] -->|scrape every 3s| B[vmm]
    B --> C[Scraper Thread]
    C --> D[(SQLite<br/>~/.vmm/data.db)]
    B --> E[HTTP Server]
    E -->|JSON API| F[Browser<br/>Chart.js Dashboard]
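
The scrape-and-persist path in the diagram can be sketched with the standard library alone. This is an illustration under stated assumptions, not vmm's actual code: the function names, the `samples` table, and the `model_name="demo"` label in the sample text are made up, and the parser assumes the endpoint emits no trailing timestamps after sample values.

```python
import sqlite3
import threading
import time
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text exposition into {series: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        # Split on the last space so spaces inside label values are preserved.
        series, _, raw = line.rpartition(" ")
        try:
            samples[series] = float(raw)
        except ValueError:
            continue  # ignore lines that are not "series value"
    return samples


def store(conn, samples, ts):
    """Insert one row per series with a shared timestamp."""
    conn.executemany(
        "INSERT INTO samples (ts, series, value) VALUES (?, ?, ?)",
        [(ts, series, value) for series, value in samples.items()],
    )
    conn.commit()
    return len(samples)


def run_scraper(url, db_path, interval, stop):
    """Scraper thread body: fetch, parse, store, then wait for the interval."""
    conn = sqlite3.connect(db_path)  # open the connection in the thread that uses it
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples (ts REAL, series TEXT, value REAL)"
    )
    while not stop.is_set():
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                text = resp.read().decode("utf-8")
            store(conn, parse_metrics(text), time.time())
        except OSError:
            pass  # server unreachable; retry on the next tick
        stop.wait(interval)


# Offline demo: parse a small sample instead of hitting a live server.
SAMPLE = """\
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="demo"} 2.0
# TYPE vllm:num_requests_waiting gauge
vllm:num_requests_waiting{model_name="demo"} 0.0
"""
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (ts REAL, series TEXT, value REAL)")
parsed = parse_metrics(SAMPLE)
stored = store(conn, parsed, ts=time.time())
print(stored)
```

A background scraper would be started with something like `threading.Thread(target=run_scraper, args=(url, path, 3, threading.Event()), daemon=True).start()`, leaving the main thread free to serve HTTP.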

📈 Monitored Metrics

Metric            Source                                 Type
Running Requests  vllm:num_requests_running              Gauge
Waiting Requests  vllm:num_requests_waiting              Gauge
KV Cache Usage    vllm:kv_cache_usage_perc               Gauge
Cache Hit Rate    prompt_tokens_cached / prompt_tokens   Derived
Requests/s        vllm:request_success_total delta       Counter rate
Output Tokens/s   vllm:generation_tokens_total delta     Counter rate
Input Tokens/s    vllm:prompt_tokens_total delta         Counter rate
TTFT              time_to_first_token_seconds            Histogram avg
ITL               inter_token_latency_seconds            Histogram avg
E2E Latency       e2e_request_latency_seconds            Histogram avg
Uptime            process_start_time_seconds             Gauge
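
The "Counter rate", "Histogram avg", and "Derived" types can be computed from two consecutive scrapes. The sketch below shows one common way to do so. The function names are hypothetical, and it assumes the histogram average is taken over the scrape interval using the `_sum` and `_count` series; vmm may compute these differently.

```python
def counter_rate(prev, curr, interval_s):
    """Per-second rate of a monotonically increasing counter."""
    delta = curr - prev
    if delta < 0 or interval_s <= 0:
        return 0.0  # counter reset (e.g. server restart) or invalid interval
    return delta / interval_s


def histogram_avg(prev_sum, prev_count, curr_sum, curr_count):
    """Mean observation over the interval, from histogram _sum and _count deltas."""
    d_count = curr_count - prev_count
    if d_count <= 0:
        return None  # no new observations in this interval
    return (curr_sum - prev_sum) / d_count


def cache_hit_rate(cached_tokens, prompt_tokens):
    """Fraction of prompt tokens served from cache."""
    return cached_tokens / prompt_tokens if prompt_tokens > 0 else 0.0


def uptime_seconds(process_start_time, now):
    """Uptime derived from the process_start_time_seconds gauge."""
    return max(0.0, now - process_start_time)


# 30 new tokens over a 3-second scrape interval.
tokens_per_s = counter_rate(100, 130, 3)
# 3 new requests whose latencies add up to 1.5 seconds.
avg_latency = histogram_avg(2.0, 10, 3.5, 13)
hit_rate = cache_hit_rate(250, 1000)
print(tokens_per_s, avg_latency, hit_rate)
```

Returning `None` when no new observations arrived lets a dashboard show a gap rather than a misleading zero latency.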

License

MIT
