An interactive, universal accelerator and system monitor for NPU, GPU, and TPU environments.

xputop

An interactive, universal accelerator and system monitor for terminal environments, supporting Huawei Ascend NPUs, NVIDIA GPUs, AMD GPUs, Intel XPUs, Google TPUs, and Custom backends. Inspired by nvitop and nvtop.

Features

  • Universal Multi-Backend Support: Monitors Huawei Ascend NPUs, NVIDIA GPUs, AMD GPUs, Intel XPUs, and Google TPUs. Fallback or custom backends can be defined directly in the configuration file using regex extractors.
  • Rich Terminal UI: Beautiful per-device cards, memory usage bars, and active process lists.
  • Fixed process row count per card to prevent UI jitter (-p N, default 3)
  • Sparkline history curves (like nvtop) for accelerator, CPU, memory, disk, and network metrics
  • Detail mode (-d) for combined chart + CPU + memory + disk view
  • Full detail mode (-a) adds network I/O monitoring and system process tree
  • Background Async Logging: Run xputop -o logs.jsonl to record all metrics to a JSONL or CSV file in the background. Logging is designed not to block the monitored workload or reduce training throughput.
  • Offline HTML Visualization: Render a captured log as an interactive HTML dashboard with xputop view logs.jsonl.
  • Auto-Kill / Memory Safety Engine: Automatically sends SIGKILL to the top memory-consuming processes when VRAM or system RAM exceeds a critical threshold (supports the top N processes, dropping to an explicit limit, or targeted killing).
  • System Telemetry: CPU per-core utilization, System memory (RAM + Swap), Disk usage paths, Network I/O.
  • Threshold-based email alerting for both accelerator and system metrics.
  • Configuration stored securely via xputop config.
  • Lightweight design: no extra subprocesses, strict timeout handling, instantaneous CPU sampling, and no blocking I/O buffering.
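
To make the sparkline feature concrete, here is a minimal Python sketch of how a fixed-length history can be rendered as a one-line curve. It is an illustration under stated assumptions, not xputop's actual implementation: the names BLOCKS and sparkline are hypothetical, and the 0 to 100 value range assumes a percentage metric.

```python
from collections import deque

# Eight block characters, lowest to highest (assumed glyph set, not taken from xputop).
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values, lo=0.0, hi=100.0):
    """Map each sample to one block character, clamping to [lo, hi]."""
    span = hi - lo
    if span <= 0:
        raise ValueError("hi must be greater than lo")
    chars = []
    for value in values:
        clamped = min(max(value, lo), hi)
        index = int((clamped - lo) / span * (len(BLOCKS) - 1))
        chars.append(BLOCKS[index])
    return "".join(chars)


# A bounded deque drops the oldest sample automatically; 120 matches the
# documented default of --chart-length.
history = deque(maxlen=120)
for sample in (0, 25, 50, 75, 100):
    history.append(sample)

print(sparkline(history))
```

A bounded deque keeps memory use constant however long the monitor runs, which is one plausible way to back a history of a fixed number of points.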

Installation

# Minimal version (Hardware probing only)
pip install xputop

# Full feature-set (Includes system-monitoring tools)
pip install "xputop[all]"

Quick Start

# Launch the interactive TUI
xputop

# Detail mode: chart + CPU + memory + disk
xputop -d

# Show with sparkline curves + CPU + memory + disk
xputop -C --cpu -m -D

# Generate an offline dashboard trace
xputop -d -o metrics.jsonl
xputop view metrics.jsonl  # Renders metrics.html for viewing offline!

# Record in CSV format instead of JSONL
xputop -d -o metrics.csv --output-format csv

# Control process display rows (default 3, 0=hide, -1=show all)
xputop -p 5 -C
xputop -p 0           # hide all processes

# Monitor specific disk paths
xputop -C -D /data /home

# List available probed backends
xputop --list-backends
xputop --reset-backend

# Run in demo mode (no hardware needed)
xputop --demo -d

# Print a single snapshot and exit
xputop once --cpu --mem --disk

# Print snapshot as JSON
xputop once --json --cpu --mem --disk /data

# Show Chinese help
xputop --zh

# Generate a default configuration file
xputop config --generate
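
A log written with -o metrics.jsonl can also be inspected with a few lines of Python, since JSONL stores one JSON object per line. The sketch below is hedged: the record schema is not documented here, so the helper names load_records and peak are hypothetical, and the field name mem_usage_percent is only an assumed example borrowed from the --alert syntax.

```python
import json


def load_records(path):
    """Yield one dict per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def peak(records, key):
    """Return the largest numeric value stored under `key`, or None if absent."""
    values = [r[key] for r in records if isinstance(r.get(key), (int, float))]
    return max(values) if values else None


# Example call (assumes metrics.jsonl exists and contains the assumed field):
# print(peak(load_records("metrics.jsonl"), "mem_usage_percent"))
```

Check the first few lines of a real log to find the actual field names before relying on any particular key.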

Command-Line Options

| Short | Long | Default | Description |
|-------|------|---------|-------------|
| -V | --version | | Show version and exit |
| -i | --interval | 2.0 | Refresh interval in seconds |
| -c | --config | | Path to configuration file |
| | --backend | auto | Force specific backend (npu, nvidia, amd, etc.) |
| | --list-backends | | Print probed hardware tools |
| -C | --chart | off | Enable nvtop-style sparkline history curves |
| -l | --chart-length | 120 | Number of history points for sparklines |
| -p | --processes | 3 | Process rows per card (0=hide, -1=all) |
| | --alert | | Temporary alert overrides (e.g. mem_usage_percent=95.0) |
| | --kill | 0.0 | Auto-kill on memory alert. 0=disable(default), <0=Top |
| -o | --output | | Path for the async logger output (JSONL/CSV) |
| | --output-format | jsonl | Storage format for the data logger |
| | --cpu | off | Show CPU per-core utilization panel |
| -m | --mem | off | Show system memory panel |
| -D | --disk | off | Show disk usage panel (optionally specify paths) |
| -d | --detail | off | Detail mode = --chart --cpu --mem --disk |
| -a | --detail-all | off | Full detail = -d + network + process tree |
| | --demo | off | Demo mode with fake AI hardware |
| | --zh | | Show Chinese help |
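
The --alert option takes overrides in KEY=VALUE form, such as mem_usage_percent=95.0. As a rough illustration of how such overrides could be parsed and checked against current readings, consider the Python sketch below. It is an assumption-laden example, not xputop's code: the function names parse_alert_overrides and exceeded are hypothetical, and treating "exceeds" as a strict greater-than comparison is a guess.

```python
def parse_alert_overrides(items):
    """Turn ["key=value", ...] into {key: float(value)}."""
    overrides = {}
    for item in items:
        key, sep, raw = item.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        overrides[key.strip()] = float(raw)
    return overrides


def exceeded(metrics, thresholds):
    """Return the sorted names of metrics strictly above their threshold."""
    return sorted(
        name
        for name, limit in thresholds.items()
        if name in metrics and metrics[name] > limit
    )


thresholds = parse_alert_overrides(["mem_usage_percent=95.0"])
print(exceeded({"mem_usage_percent": 97.2}, thresholds))
```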

Custom Hardware Support

xputop can extract monitoring values from any CLI tool: define standard regex patterns with named groups directly in ~/.config/xputop/config.toml.

[backend.custom.my_chip]
probe_command = "custom-smi --version"
info_command = "custom-smi log --csv"
device_pattern = "Device (?P<id>\\d+) \\| (?P<name>\\S+) \\| Temp=(?P<temp>\\d+) \\| Power=(?P<power>[\\d.]+) \\| Util=(?P<util>\\d+)% \\| Mem=(?P<mem_used>\\d+)/(?P<mem_total>\\d+)"

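
Before adding a pattern to the configuration, it can help to test it against a sample line of tool output. The Python sketch below applies the same regex as device_pattern above (with TOML's doubled backslashes reduced to single ones) to a made-up line. The sample line and the parse_devices helper are hypothetical, and xputop's own handling of the captured groups may differ.

```python
import re

# Same pattern as device_pattern in the TOML example, written as a raw string.
DEVICE_PATTERN = re.compile(
    r"Device (?P<id>\d+) \| (?P<name>\S+) \| Temp=(?P<temp>\d+) \| "
    r"Power=(?P<power>[\d.]+) \| Util=(?P<util>\d+)% \| "
    r"Mem=(?P<mem_used>\d+)/(?P<mem_total>\d+)"
)

# Invented output line for a fictional "custom-smi" tool.
SAMPLE = "Device 0 | my_chip | Temp=55 | Power=120.5 | Util=87% | Mem=2048/16384"


def parse_devices(text):
    """Return one dict of named-group captures per matching device line."""
    return [match.groupdict() for match in DEVICE_PATTERN.finditer(text)]


for device in parse_devices(SAMPLE):
    print(device)
```

All captured values are strings, so a consumer would still need to convert fields such as temp or power to numbers.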
Build & Release

See BUILD.md for development setup, building, and publishing instructions.

Requirements

  • Python >= 3.8

License

Apache License 2.0
