Production-grade vLLM metrics monitoring TUI with persistent storage and Grafana-style visualizations

vllmtop

Production-grade TUI dashboard for monitoring vLLM inference servers. Real-time metrics, persistent storage, and Grafana-style visualizations in your terminal.

Features

Live Dashboard - Real-time KPI cards, gauge bars, and sparklines with 60s rolling history
Historical Explorer - Time-range analysis with interactive charts
Request Breakdown - Request completion outcomes and statistics
Cache & System Analytics - KV cache usage, prefix cache hit rates, and system metrics
Configurable Alerts - Threshold-based alert rules with persistent alert history
Multiple Graph Styles - Line, braille, and block charts (press g to cycle)
Multi-Server Monitoring - Monitor multiple vLLM instances from a single dashboard
Persistent Storage - All metrics saved to SQLite with automatic retention management
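
As an illustration of the "sparklines with 60s rolling history" idea, here is a minimal sketch of a time-windowed buffer and a block-character sparkline. The names RollingHistory and sparkline are hypothetical and are not taken from vllmtop's source; the scaling to eight block characters is an assumption about how such a chart could be drawn.

```python
from collections import deque

BLOCKS = "▁▂▃▄▅▆▇█"  # eight levels, lowest to highest


class RollingHistory:
    """Keep only the samples from the last `window_s` seconds (hypothetical helper)."""

    def __init__(self, window_s: float = 60.0) -> None:
        self.window_s = window_s
        self.samples: deque[tuple[float, float]] = deque()  # (timestamp, value)

    def add(self, ts: float, value: float) -> None:
        self.samples.append((ts, value))
        # Drop samples that have aged out of the window.
        while self.samples and self.samples[0][0] < ts - self.window_s:
            self.samples.popleft()

    def values(self) -> list[float]:
        return [v for _, v in self.samples]


def sparkline(values: list[float]) -> str:
    """Map each value to a block character, scaled between the min and max."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no range to scale against; draw the lowest block.
        return BLOCKS[0] * len(values)
    return "".join(BLOCKS[int((v - lo) / span * (len(BLOCKS) - 1))] for v in values)
```

Feeding one sample per poll into add() and rendering sparkline(history.values()) on each refresh would give a chart that always covers the most recent window.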

Installation

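Requires Python 3.11+.

From PyPI (the package is published as vllm-mon; the command it provides is vllmtop):

pip install vllm-mon

From a source checkout:

pip install .

For development:

pip install -e ".[dev]"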

Quick Start

# Connect to a local vLLM server (default: http://localhost:8000)
vllmtop

# Connect to a remote server
vllmtop --url http://gpu-server:8000

# Use a config file
vllmtop --config config.yaml

# Custom poll interval (2 seconds) and retention (60 days)
vllmtop --url http://localhost:8000 --interval 2 --retention 60
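
For context on what a monitor like this polls: a vLLM server exposes its metrics in Prometheus text format at /metrics under the server URL. The sketch below fetches and parses that output using only the standard library. It is a simplified illustration, not vllmtop's implementation; the function names are hypothetical, and labels are discarded for brevity.

```python
import urllib.request


def parse_prometheus_text(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition into {sample_name: value}.

    Simplifications: labels are discarded, so if a name appears with several
    label sets the last one wins; optional trailing timestamps are ignored.
    """
    metrics: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        if "}" in line:
            # name{label="..."} value
            head, _, rest = line.rpartition("}")
            name = head.split("{", 1)[0]
        else:
            # name value
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue
        try:
            metrics[name] = float(fields[0])
        except ValueError:
            continue  # ignore lines this simple parser cannot read
    return metrics


def fetch_metrics(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> dict[str, float]:
    """Fetch and parse <base_url>/metrics (requires a reachable server)."""
    url = f"{base_url.rstrip('/')}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_prometheus_text(resp.read().decode("utf-8"))
```

Calling fetch_metrics("http://gpu-server:8000") once per poll interval would yield one snapshot per tick, which is roughly the data a dashboard of this kind needs.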

Configuration

Copy the example config and customize:

cp config.example.yaml config.yaml

targets:
  - url: http://localhost:8000
    name: "GPU Server 1"

graph_style: "line"       # line, braille, or block
poll_interval: 1.0        # seconds
db_path: "./vllm_metrics.db"
retention_days: 30

alert_rules:
  - name: "KV Cache Critical"
    metric: "vllm:kv_cache_usage_perc"
    operator: ">"
    threshold: 90.0
    enabled: true

See config.example.yaml for the full configuration reference.
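
To make the alert_rules fields concrete, here is a minimal sketch of how a threshold rule with those fields could be evaluated against a snapshot of metric values. The AlertRule class and firing function are hypothetical, and only the ">" operator appears in the example config; the other operators in the table are assumptions.

```python
import operator as op
from dataclasses import dataclass

# Only ">" is shown in the example config; the rest are assumed for illustration.
OPERATORS = {">": op.gt, ">=": op.ge, "<": op.lt, "<=": op.le, "==": op.eq}


@dataclass
class AlertRule:
    """Mirrors the fields of one alert_rules entry (hypothetical class)."""
    name: str
    metric: str
    operator: str
    threshold: float
    enabled: bool = True


def firing(rules: list[AlertRule], metrics: dict[str, float]) -> list[str]:
    """Return the names of enabled rules whose condition holds right now."""
    fired = []
    for rule in rules:
        if not rule.enabled or rule.metric not in metrics:
            continue  # disabled rules and missing metrics never fire
        compare = OPERATORS.get(rule.operator)
        if compare is None:
            raise ValueError(f"unsupported operator: {rule.operator!r}")
        if compare(metrics[rule.metric], rule.threshold):
            fired.append(rule.name)
    return fired
```

With the rule from the example config, firing(rules, {"vllm:kv_cache_usage_perc": 95.0}) returns ["KV Cache Critical"]. The sample value here is on the same scale as the configured threshold; check how the metric is scaled in your setup before choosing thresholds.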

CLI Options

Option	Default	Description
`--url`	`http://localhost:8000`	vLLM server URL
`--db`	`./vllm_metrics.db`	SQLite database path
`--retention`	`30`	Data retention in days
`--interval`	`1.0`	Poll interval in seconds
`--config`	-	Path to YAML config file
`--graph-style`	`line`	Graph style: line, braille, or block

Keyboard Shortcuts

Key	Action
`1`-`5`	Switch tabs
`g`	Cycle graph style
`s`	Screenshot
`ctrl+p`	Command palette
`q`	Quit

Metrics Tracked

Requests - Running, waiting, swapped, queue time
Cache - KV cache usage, GPU cache usage, prefix cache hit rate
Tokens - Prompt tokens, generation tokens, totals
Latency - Time-to-first-token (TTFT), time-per-output-token (TPOT), end-to-end latency
Throughput - Prompt and generation throughput (tok/s)
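
Throughput in tok/s is typically derived rather than read directly: token totals such as vllm:prompt_tokens_total and vllm:generation_tokens_total are monotonically increasing counters, so a rate comes from the difference between two polls. The sketch below shows that calculation; whether vllmtop computes throughput exactly this way is not stated here, and the function name is hypothetical.

```python
def rate_per_second(prev_value: float, prev_ts: float,
                    curr_value: float, curr_ts: float) -> float:
    """Per-second rate of a monotonically increasing counter between two samples."""
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        return 0.0  # duplicate or out-of-order sample: no meaningful rate
    delta = curr_value - prev_value
    if delta < 0:
        # The counter went backwards, which usually means the server restarted;
        # treat the current value as the increase since the reset.
        delta = curr_value
    return delta / elapsed
```

For example, a generation-token counter that moves from 1000 to 1500 over two seconds corresponds to 250 tok/s.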

License

MIT

Project details

This version: 0.1.0 (released Apr 9, 2026)

Download files

Source Distribution: vllm_mon-0.1.0.tar.gz (18.1 MB), uploaded Apr 9, 2026
Built Distribution: vllm_mon-0.1.0-py3-none-any.whl (45.4 kB), uploaded Apr 9, 2026, Python 3

File details

Details for the file vllm_mon-0.1.0.tar.gz.

File metadata

Download URL: vllm_mon-0.1.0.tar.gz
Upload date: Apr 9, 2026
Size: 18.1 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for vllm_mon-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`2b082fb5742b118f0eae8ce4ae3d93801706593c93e258de71f3fbe1d2836d4b`
MD5	`cdca5d44f7d57c9ddb81e2058ee06776`
BLAKE2b-256	`39e1f8bce1a464a4933d4941523ca968253ca006eaf652811ca228fce71cc8e9`
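
A downloaded file can be checked against a published digest with the standard library. The following is a minimal sketch; the function names are hypothetical, and the file is read in chunks so that a multi-megabyte archive does not need to fit in memory.

```python
import hashlib
import hmac


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the published one (case-insensitive)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

Usage: matches("vllm_mon-0.1.0.tar.gz", "2b082fb5742b118f0eae8ce4ae3d93801706593c93e258de71f3fbe1d2836d4b") should return True for an intact copy of the source distribution.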


File details

Details for the file vllm_mon-0.1.0-py3-none-any.whl.

File metadata

Download URL: vllm_mon-0.1.0-py3-none-any.whl
Upload date: Apr 9, 2026
Size: 45.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for vllm_mon-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`13bc0e833a37804b48253ed664dd82585cd2aa2505bea778e13b53d5aa3c8193`
MD5	`05345b102d91c42e572728765e09a4bd`
BLAKE2b-256	`85f81701069546315d18409956659309d7e7e2ca30a29313e8f2355424a9f148`

