Lightweight monitoring toolkit for Hygon DCU clusters

hytop - monitoring tools

Quick start

Install from PyPI

Using pipx (recommended):

pipx install hytop
hytop --help

Using uv:

uv tool install hytop
hytop --help

Install from source

uv:

uv run hytop --help

pip:

pip install .
hytop --help

pipx:

pipx install .
hytop --help

Prerequisites

  • Python >= 3.10
  • Python packages: rich, typer
  • Passwordless SSH access to any remote nodes being monitored

hytop

# Show the version number
hytop --version

# Specify a timeout for the subcommand
hytop --timeout 300 [COMMAND]

# 0.5-second interval and 5-second rolling window for the subcommand
hytop -n 0.5 --window 5 [COMMAND]

# Specify a list of nodes for the subcommand
hytop -H node01,node02 [COMMAND]

# Specify a list of nodes with non-standard ssh ports for the subcommand
hytop -H node01:3333,node02:3333 [COMMAND]
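For scripts that generate or validate the -H argument, the host[:port] format shown above can be parsed along these lines. This is only a sketch: parse_hosts and HostSpec are hypothetical names that are not part of hytop, and the default port of 22 is an assumption based on standard SSH.

```python
from typing import List, NamedTuple


class HostSpec(NamedTuple):
    host: str
    port: int


def parse_hosts(spec: str, default_port: int = 22) -> List[HostSpec]:
    """Split a comma-separated 'host[:port]' string into HostSpec entries."""
    hosts = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray or trailing commas
        host, sep, port = item.partition(":")
        # Use the default port when no ':port' suffix is present.
        hosts.append(HostSpec(host, int(port) if sep else default_port))
    return hosts


print(parse_hosts("node01:3333,node02"))
```

The sketch does not handle IPv6 literals, which contain colons of their own.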

SSH transport

hytop collects remote metrics over a lightweight SSH pull model. SSH connection reuse is enabled by default in the core layer, so it applies to every subcommand that collects data over SSH. The following options are set:

  • ControlMaster=auto
  • ControlPersist=30s
  • ControlPath=~/.ssh/hytop-%C
  • ServerAliveInterval=5
  • ServerAliveCountMax=1
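The options above are standard OpenSSH client settings that can be passed with -o. As a rough illustration, a command line carrying them could be assembled as follows. build_ssh_argv and the remote command are hypothetical, and hytop's actual invocation may differ.

```python
from typing import List, Optional

# Option values copied from the list above.
SSH_OPTIONS = {
    "ControlMaster": "auto",           # reuse a master connection, or create one
    "ControlPersist": "30s",           # keep the master open 30s after last use
    "ControlPath": "~/.ssh/hytop-%C",  # %C: hash identifying the connection
    "ServerAliveInterval": "5",        # send a keepalive after 5s of silence
    "ServerAliveCountMax": "1",        # give up after one unanswered keepalive
}


def build_ssh_argv(host: str, remote_cmd: List[str],
                   port: Optional[int] = None) -> List[str]:
    """Return an ssh argument list with the connection-reuse options applied."""
    argv = ["ssh"]
    for key, value in SSH_OPTIONS.items():
        argv += ["-o", f"{key}={value}"]
    if port is not None:
        argv += ["-p", str(port)]
    argv.append(host)
    argv += remote_cmd
    return argv


print(" ".join(build_ssh_argv("node01", ["hy-smi", "--json"], port=3333)))
```

With a master connection in place, each poll within the persist interval skips the SSH handshake, which matters at sub-second polling intervals.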

hytop gpu

A lightweight script that polls hy-smi live and reports rolling averages across multiple hosts. It features a modern terminal UI and can block until GPUs become idle, which makes it usable as a simple blocking scheduler for GPU jobs.
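To make the idea of a rolling average concrete, here is one way to compute a mean over a time window such as the one set by --window. It illustrates the technique only and is not hytop's implementation; RollingMean is a hypothetical name.

```python
from collections import deque


class RollingMean:
    """Mean of the samples whose timestamps fall within the last `window` seconds."""

    def __init__(self, window: float) -> None:
        self.window = window
        self._samples = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0.0

    def add(self, timestamp: float, value: float) -> float:
        """Record a sample and return the mean over the current window."""
        self._samples.append((timestamp, value))
        self._total += value
        # Evict samples that are older than the window.
        while self._samples[0][0] < timestamp - self.window:
            _, old_value = self._samples.popleft()
            self._total -= old_value
        return self._total / len(self._samples)


power = RollingMean(window=5.0)
print(power.add(0.0, 10.0))  # 10.0
print(power.add(2.0, 20.0))  # 15.0
print(power.add(6.0, 30.0))  # 25.0 (the sample at t=0.0 has left the window)
```

Keeping a running total makes each update cheap, since the cost depends only on how many samples expire rather than on the window length.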

Usage

Simple examples:

# Local node, all GPUs
hytop gpu

# Two nodes, 0.5-second interval
hytop -H node01,node02 -n 0.5 gpu

# Exit with code 0 when all monitored GPUs are available
hytop gpu --devices 0,1 --wait-idle

# Wait for GPUs to be idle for 30 seconds before exiting
hytop gpu --devices 0,1 --wait-idle --wait-idle-seconds 30

# Wait at most 300s for availability (exit 124 on timeout)
hytop gpu --devices 0,1 --wait-idle --timeout 300

# Fine-grained columns (output order follows show-flag order)
hytop gpu --showtemp --showpower
hytop gpu --showpower --showtemp

Queue jobs in shared environments:

if hytop -H node01,node02 gpu --timeout 300 --wait-idle; then
  echo "GPUs available, starting workload..."
  # YOUR COMMAND HERE (e.g., python train.py)
else
  echo "Error: GPUs not available (timeout or error), aborting pipeline."
fi

Exit codes

Designed to be script-friendly:

  • 0: Availability condition met (GPUs are idle).
  • 124: Timeout reached before the availability condition was met.
  • 130: Interrupted by the user (Ctrl+C).
  • 2: Argument or input error.
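When hytop is driven from Python rather than a shell, the exit codes listed above can be told apart explicitly instead of being collapsed into success or failure. A minimal sketch follows; describe_exit and wait_for_gpus are hypothetical helper names, and the command line mirrors the shell example earlier in this section.

```python
import subprocess

# Exit codes as documented in the list above.
EXIT_MEANINGS = {
    0: "idle",
    2: "usage-error",
    124: "timeout",
    130: "interrupted",
}


def describe_exit(code: int) -> str:
    """Map a hytop exit code to a short label."""
    return EXIT_MEANINGS.get(code, f"unexpected({code})")


def wait_for_gpus(hosts: str, timeout_s: int) -> str:
    """Block until the GPUs on `hosts` are idle; requires hytop on PATH."""
    cmd = ["hytop", "-H", hosts, "gpu", "--timeout", str(timeout_s), "--wait-idle"]
    return describe_exit(subprocess.run(cmd).returncode)


print(describe_exit(124))  # timeout
```

A caller might retry on "timeout" but fail immediately on "usage-error", which a plain if/else in the shell cannot distinguish.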

Fine-grained metric flags

hytop gpu reads the output of hy-smi --json and supports a subset of the hy-smi --show* flags:

  • --showtemp: GPU core temperature (Temp)
  • --showpower: average package power (AvgPwr, plus AvgPwr@window)
  • --showsclk: sclk frequency (sclk)
  • --showmemuse: VRAM usage (VRAM%)
  • --showuse: GPU utilization (GPU%, plus GPU%@window)

If no --show* flags are specified, hytop defaults to: --showtemp --showpower --showsclk --showmemuse --showuse.
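The relationship between flags, column order, and the default set can be sketched as follows. The column labels are copied from the list above. columns_for is a hypothetical function, and the real tool may label or deduplicate columns differently.

```python
from typing import List

# Flag-to-column mapping taken from the list above. Insertion order matches
# the documented default flag order.
FLAG_COLUMNS = {
    "--showtemp": ["Temp"],
    "--showpower": ["AvgPwr", "AvgPwr@window"],
    "--showsclk": ["sclk"],
    "--showmemuse": ["VRAM%"],
    "--showuse": ["GPU%", "GPU%@window"],
}
DEFAULT_FLAGS = list(FLAG_COLUMNS)


def columns_for(argv: List[str]) -> List[str]:
    """Return metric columns in the order their flags appear on the command line."""
    flags = [arg for arg in argv if arg in FLAG_COLUMNS] or DEFAULT_FLAGS
    columns = []
    for flag in flags:
        for column in FLAG_COLUMNS[flag]:
            if column not in columns:  # a repeated flag adds nothing new
                columns.append(column)
    return columns


print(columns_for(["--showpower", "--showtemp"]))
# ['AvgPwr', 'AvgPwr@window', 'Temp']
```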

hytop net

Lightweight pull-based network monitor for Ethernet and InfiniBand across one or more hosts.

Usage

# Local host, auto-discover eth+ib interfaces
hytop net

# Two hosts, 0.5-second interval
hytop -H node01,node02 -n 0.5 net

# IB-only monitoring
hytop net --kind ib

# Include only selected interfaces
hytop net --ifaces eth0,mlx5_0/p1

# Stop after 60 seconds (returns 124 on timeout)
hytop --timeout 60 net

Development

Clone the repo and run make setup to create the virtual environment, install all dependencies (including dev), and configure pre-commit hooks:

make setup

Common development commands:

make format     # Auto-fix and format code (ruff)
make lint       # Check code style and errors without modifying files
make test       # Run all unit tests (pytest)
make bump part=patch  # Bump version (patch/minor/major or X.Y.Z)
make clean      # Remove build caches and the virtual environment

Version bump

Versioning is handled by bump-my-version. Running make bump will:

  1. Update __version__ in src/hytop/__init__.py
  2. Update current_version in pyproject.toml
  3. Create a commit (e.g., [hytop] Bump version: 0.1.1 → 0.1.2)
  4. Create a tag (e.g., hytop-0.1.2)

make bump part=patch          # 0.1.1 -> 0.1.2
make bump part=minor          # 0.1.2 -> 0.2.0
make bump part=major          # 0.2.0 -> 1.0.0
make bump part=1.2.3          # set an explicit version
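The bump semantics shown above can be expressed as a small function. This only illustrates the version arithmetic and the tag naming from the examples. It does not reflect how bump-my-version works internally, and bump and tag_name are hypothetical names.

```python
def bump(version: str, part: str) -> str:
    """Return the next version for part = patch, minor, major, or an explicit X.Y.Z."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    if part == "minor":
        return f"{major}.{minor + 1}.0"  # lower components reset to zero
    if part == "major":
        return f"{major + 1}.0.0"
    # Anything else must be an explicit X.Y.Z version.
    pieces = part.split(".")
    if len(pieces) != 3 or not all(p.isdigit() for p in pieces):
        raise ValueError(f"unsupported part: {part!r}")
    return part


def tag_name(version: str) -> str:
    """Tag format taken from the example above (e.g., hytop-0.1.2)."""
    return f"hytop-{version}"


print(bump("0.1.1", "patch"), tag_name(bump("0.1.1", "patch")))
# 0.1.2 hytop-0.1.2
```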

Publish

Releases are published to PyPI automatically via GitHub Actions when a version tag is pushed.

# 1. Bump version (auto-commits and auto-tags)
make bump part=patch

# 2. Push commits and tags to trigger GitHub Actions release
git push --follow-tags

To test building distributions locally:

make build
