Lightweight monitoring dashboard for vLLM inference servers

vLLM Metrics Monitor

One command to monitor your vLLM server metrics

vLLM Metrics Monitor (vmm) is a lightweight dashboard that scrapes Prometheus metrics from a vLLM server, persists them to SQLite, and serves a real-time web UI. It has zero external dependencies: pure Python standard library.
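
To illustrate the scraping step, here is a minimal sketch of parsing the Prometheus text format with only the standard library. It is not vmm's actual parser: the function name `parse_metrics` and the choice to sum values across label sets are assumptions made for this example.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
# Any trailing timestamp is ignored.
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')


def parse_metrics(text):
    """Return {metric_name: value}, summing across label sets (a simplification)."""
    totals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank lines and HELP/TYPE comments
            continue
        match = _SAMPLE.match(line)
        if not match:
            continue
        name, _labels, raw = match.groups()
        try:
            value = float(raw)  # also accepts NaN and +Inf
        except ValueError:
            continue
        totals[name] = totals.get(name, 0.0) + value
    return totals


# Hypothetical payload in the shape a vLLM /metrics endpoint returns.
sample = """\
# HELP vllm:num_requests_running Number of requests currently running.
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="demo"} 2.0
vllm:num_requests_waiting{model_name="demo"} 1.0
vllm:generation_tokens_total{model_name="demo"} 1234.0
"""
print(parse_metrics(sample))
```

In a live setup the text would come from something like `urllib.request.urlopen(url).read().decode()`; the sample string keeps the sketch runnable offline.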

🚀 Quick Start

# Install with uv
uv tool install vllm-metrics-monitor

# Or install with pip: `pip install vllm-metrics-monitor`

# Launch dashboard
vmm http://your-vllm:8000/metrics

Open http://localhost:8080 — that's it.

📖 Usage

vmm [URL] [OPTIONS]

Positional:
  URL                   vLLM Prometheus metrics endpoint
                        (default: http://localhost:8000/metrics)

Options:
  -p, --port PORT       Dashboard HTTP port (default: 8080)
  -i, --interval SEC    Scrape interval in seconds (default: 3)
  --retention HOURS     Data retention period (default: 720, i.e. 30 days)
  --db PATH             SQLite database path (default: ~/.vmm/data.db)
  --reset               Delete existing database and start fresh
  --debug               Enable debug logging
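
As a rough illustration, the options above could be declared with `argparse` as follows. This is a sketch, not vmm's source: the function name `build_parser` and the argument types (`int` for the port, `float` for interval and retention) are assumptions; only the flag names and defaults come from the usage text.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented vmm options."""
    parser = argparse.ArgumentParser(prog="vmm")
    parser.add_argument("url", nargs="?", metavar="URL",
                        default="http://localhost:8000/metrics",
                        help="vLLM Prometheus metrics endpoint")
    parser.add_argument("-p", "--port", type=int, default=8080,
                        help="Dashboard HTTP port")
    parser.add_argument("-i", "--interval", type=float, default=3, metavar="SEC",
                        help="Scrape interval in seconds")
    parser.add_argument("--retention", type=float, default=720, metavar="HOURS",
                        help="Data retention period (720 hours is 30 days)")
    parser.add_argument("--db", default="~/.vmm/data.db", metavar="PATH",
                        help="SQLite database path")
    parser.add_argument("--reset", action="store_true",
                        help="Delete existing database and start fresh")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug logging")
    return parser


# Equivalent to: vmm http://vllm-server:8000/metrics -p 9090 -i 5
args = build_parser().parse_args(
    ["http://vllm-server:8000/metrics", "-p", "9090", "-i", "5"]
)
print(args.url, args.port, args.interval, args.retention)
```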

Examples

vmm http://vllm-server:8000/metrics -p 9090 -i 5
vmm http://vllm-server:8000/metrics --reset
vmm http://vllm-server:8000/metrics --retention 72 --db /data/vmm.db

Docker

# Build
docker build -t vmm .

# Run
docker run -d --network host vmm http://localhost:8000/metrics

# Or use docker compose
METRICS_URL=http://192.168.1.100:8000/metrics docker compose up -d

🏗️ Architecture

graph LR
    A[vLLM /metrics] -->|scrape every 3s| B[vmm]
    B --> C[Scraper Thread]
    C --> D[(SQLite<br/>~/.vmm/data.db)]
    B --> E[HTTP Server]
    E -->|JSON API| F[Browser<br/>Chart.js Dashboard]
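
The scraper-to-SQLite path in the diagram can be sketched as below. The table layout, the function names (`store`, `prune`, `scrape_loop`), and the injected `fetch` callable are all assumptions for illustration; vmm's real schema and threading details may differ.

```python
import sqlite3
import threading
import time

# Hypothetical schema: one row per metric per scrape.
SCHEMA = "CREATE TABLE IF NOT EXISTS samples (ts REAL, name TEXT, value REAL)"


def store(conn, ts, metrics):
    """Insert one scrape ({name: value}) taken at time ts."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO samples VALUES (?, ?, ?)",
            [(ts, name, value) for name, value in metrics.items()],
        )


def prune(conn, now, retention_hours):
    """Delete rows older than the retention window."""
    with conn:
        conn.execute("DELETE FROM samples WHERE ts < ?",
                     (now - retention_hours * 3600,))


def scrape_loop(fetch, db_path, interval, retention_hours, stop):
    """Worker body: fetch, store, prune, then sleep until the next tick."""
    conn = sqlite3.connect(db_path)  # opened inside the worker thread
    conn.execute(SCHEMA)
    while not stop.is_set():
        now = time.time()
        try:
            store(conn, now, fetch())
            prune(conn, now, retention_hours)
        except Exception as exc:  # keep the loop alive if a scrape fails
            print(f"scrape failed: {exc}")
        stop.wait(interval)
    conn.close()


if __name__ == "__main__":
    import os
    import tempfile

    stop = threading.Event()
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    fake_fetch = lambda: {"vllm:num_requests_running": 1.0}  # stands in for a real scrape
    worker = threading.Thread(target=scrape_loop,
                              args=(fake_fetch, path, 0.1, 720, stop),
                              daemon=True)
    worker.start()
    time.sleep(0.35)
    stop.set()
    worker.join()
```

Opening the connection inside the worker avoids sharing a `sqlite3` connection across threads, which the module disallows by default.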

📈 Monitored Metrics

Metric            Source                                Type
Running Requests  vllm:num_requests_running             Gauge
Waiting Requests  vllm:num_requests_waiting             Gauge
KV Cache Usage    vllm:kv_cache_usage_perc              Gauge
Cache Hit Rate    prompt_tokens_cached / prompt_tokens  Derived
Requests/s        vllm:request_success_total delta      Counter rate
Output Tokens/s   vllm:generation_tokens_total delta    Counter rate
Input Tokens/s    vllm:prompt_tokens_total delta        Counter rate
TTFT              time_to_first_token_seconds           Histogram avg
ITL               inter_token_latency_seconds           Histogram avg
E2E Latency       e2e_request_latency_seconds           Histogram avg
Uptime            process_start_time_seconds            Gauge
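
The "Counter rate", "Histogram avg", and "Derived" types can be computed from consecutive scrapes roughly as follows. This is a sketch under assumptions: the function names are hypothetical, and it assumes averages are taken over the delta between two scrapes using the histogram's `_sum` and `_count` series, which may not match vmm's exact method.

```python
def counter_rate(prev, curr, elapsed_s):
    """Per-second rate between two readings of a monotonically increasing counter."""
    if elapsed_s <= 0:
        return 0.0
    delta = curr - prev
    if delta < 0:  # counter went backwards: the server restarted, so count from zero
        delta = curr
    return delta / elapsed_s


def histogram_avg(prev_sum, prev_count, curr_sum, curr_count):
    """Mean observation in the interval, from deltas of the _sum and _count series."""
    new_observations = curr_count - prev_count
    if new_observations <= 0:
        return None  # nothing observed in this interval
    return (curr_sum - prev_sum) / new_observations


def cache_hit_rate(cached_tokens, prompt_tokens):
    """Fraction of prompt tokens served from cache."""
    return cached_tokens / prompt_tokens if prompt_tokens > 0 else 0.0


def uptime_seconds(now, process_start_time):
    """Uptime derived from the process_start_time_seconds gauge."""
    return max(0.0, now - process_start_time)


# Two scrapes 3 seconds apart (made-up numbers).
print(counter_rate(100, 130, 3))          # requests per second
print(histogram_avg(1.0, 10, 2.5, 15))    # average TTFT in seconds
print(cache_hit_rate(25, 100))
```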

License

MIT

Download files

Source Distribution

vllm_metrics_monitor-0.2.2.tar.gz (150.2 kB), uploaded as Source

Built Distribution

vllm_metrics_monitor-0.2.2-py3-none-any.whl (28.4 kB), uploaded as Python 3

File details

Details for the file vllm_metrics_monitor-0.2.2.tar.gz.

File metadata

  • Size: 150.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.16 (uv publish, on Ubuntu 24.04 noble)

File hashes

Hashes for vllm_metrics_monitor-0.2.2.tar.gz
Algorithm    Hash digest
SHA256       2f831ce091fff22abdefcde2b2c20b5595f89644d54c3581c6c9f8129aa8f9a3
MD5          bfefb7b3d561064aa0fb096b16ccc765
BLAKE2b-256  56c3d731c43bf20a1c8bbc6cd41e61ff94a5cf3e8356fbf1aeb56f95c10ee41c

File details

Details for the file vllm_metrics_monitor-0.2.2-py3-none-any.whl.

File metadata

  • Size: 28.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.16 (uv publish, on Ubuntu 24.04 noble)

File hashes

Hashes for vllm_metrics_monitor-0.2.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       d535e410c7c44b10a7f5f6c23161686bf93753810bbab22ab9005c16ff1025fe
MD5          532a27823fca3213104c3ae3000d6c40
BLAKE2b-256  bd57d68cc50c149c9b01b78c95a43514fe055fe1fe1ef9246e2c692627b536a4
