Utilities for HPC clusters including PBS/Slurm job statistics and a psutil-based (CPU/GPU/Memory) resource monitor.

hpc-stats-scripts

Utilities for working with high-performance computing (HPC) environments. The scripts help inspect PBS/Slurm job efficiency and monitor CPU/GPU and memory usage on a running system or process tree.

Made with Codex :)

Dependencies

Install the required Python packages with pip:

| Feature | Packages | Install command |
| --- | --- | --- |
| Core utilities | psutil | `pip install psutil` |
| Plotting for `psutil-monitor` | matplotlib, numpy | `pip install matplotlib numpy` |
| GPU monitoring for `psutil-monitor --gpu` | nvidia-ml-py3 (pynvml) | `pip install nvidia-ml-py3` |
| Plot + GPU combo | psutil, matplotlib, numpy, nvidia-ml-py3 | `pip install psutil matplotlib numpy nvidia-ml-py3` |
| All extras via pip extras | plot + GPU | `pip install .[all]` |

The pbs-bulk-user-stats command also expects the PBS qstat utility to be available in your environment. The slurm-bulk-user-stats command expects Slurm's sacct utility to be available in your environment.
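
A quick way to confirm that the scheduler utilities are reachable before running either command is to look them up on PATH. The sketch below is illustrative only: the function name and the mapping are hypothetical and not part of the package; only the `qstat` and `sacct` names come from the text above.

```python
import shutil

# Scheduler utilities named in this README; the mapping itself is illustrative.
REQUIRED_TOOLS = {
    "pbs-bulk-user-stats": "qstat",
    "slurm-bulk-user-stats": "sacct",
}


def find_scheduler_tools(tools=REQUIRED_TOOLS):
    """Return {command: full path or None} for each required utility."""
    return {cmd: shutil.which(binary) for cmd, binary in tools.items()}


if __name__ == "__main__":
    for cmd, path in find_scheduler_tools().items():
        print(f"{cmd}: {path or 'utility not found in PATH'}")
```

A `None` value means the corresponding command is unlikely to work in the current environment, for example because the scheduler module has not been loaded.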

Installation

Install from PyPI:

```bash
pip install hpc-stats-scripts
pip install "hpc-stats-scripts[plot]"
pip install "hpc-stats-scripts[gpu]"
pip install "hpc-stats-scripts[all]"
```

Or install the latest version directly from GitHub:

```bash
pip install "hpc-stats-scripts @ git+https://github.com/avnikonenko/hpc-stats-scripts.git"
pip install "hpc-stats-scripts[plot] @ git+https://github.com/avnikonenko/hpc-stats-scripts.git"
pip install "hpc-stats-scripts[gpu] @ git+https://github.com/avnikonenko/hpc-stats-scripts.git"
pip install "hpc-stats-scripts[all] @ git+https://github.com/avnikonenko/hpc-stats-scripts.git"
```

The base installation depends on psutil. The plot extra pulls in matplotlib and numpy for the --plot feature of psutil-monitor. The gpu extra installs nvidia-ml-py3 to enable --gpu.
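
To see which optional features are usable in the current environment, one option is to check whether the modules behind each extra can be imported. This is a minimal sketch, not something the package provides: the function name and dictionary are hypothetical, and it assumes nvidia-ml-py3 is importable as `pynvml`, as the dependency table indicates.

```python
import importlib.util

# Import names behind each extra (nvidia-ml-py3 is imported as "pynvml").
OPTIONAL_MODULES = {
    "plot": ["matplotlib", "numpy"],
    "gpu": ["pynvml"],
}


def available_extras(modules=OPTIONAL_MODULES):
    """Return {extra: True/False} depending on whether all its modules are found."""
    return {
        extra: all(importlib.util.find_spec(name) is not None for name in names)
        for extra, names in modules.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"{extra}: {'available' if ok else 'missing'}")
```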

CLI tools

`pbs-bulk-user-stats`

Summarize CPU and memory usage for PBS jobs and show which nodes the jobs are allocated to. The command relies on qstat being available in your PATH. The table now includes NGPUS (requested GPUs) when present.

Examples:

```bash
# Summarize a specific job and write CSV output
pbs-bulk-user-stats --job 12345 --csv stats.csv

# Summarize all jobs for the current user (default)
pbs-bulk-user-stats --include-finished

# Summarize all jobs for a specific user
pbs-bulk-user-stats --user myuser --include-finished

# Include finished jobs but only keep 20 finished entries
pbs-bulk-user-stats --user myuser --include-finished --finished-limit 20

# Faster best-effort mode: fetch active jobs + only N finished job details (use if default option takes too long)
pbs-bulk-user-stats --user myuser --include-finished --finished-limit 20 --finished-limit-strategy fetch
```

When invoked with no --user or --job options:

- On a login node (no $PBS_JOBID present), it summarizes all jobs for the current user.
- Inside a running PBS job (where $PBS_JOBID is set), it automatically summarizes that specific job.

```bash
pbs-bulk-user-stats
```
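
The default-target rule described above can be sketched as a small function. This only illustrates the documented behavior and is not the tool's actual code; `pick_default_target` and its parameters are hypothetical names.

```python
import getpass
import os


def pick_default_target(env=None, current_user=None):
    """Return ("job", job_id) inside a PBS job, otherwise ("user", username)."""
    env = os.environ if env is None else env
    job_id = env.get("PBS_JOBID")
    if job_id:
        # Inside a running job: summarize just that job.
        return ("job", job_id)
    # On a login node: fall back to the current user's jobs.
    return ("user", current_user or getpass.getuser())
```

Passing `env` and `current_user` explicitly keeps the function easy to test without touching the real environment.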

State codes (PBS):

R running, Q queued/waiting, X finished (requires --include-finished); other codes are counted under “other” in the summary.

Expected output (CPU/RAM only):

```text
$ pbs-bulk-user-stats

JOBID  STATE  NAME  NODES  NCPUS  WALL(h)  CPUT(h)  avgCPU  CPUeff  memUsed    memReq      memEff
-------------------------------------------------------------------------------------------------
0001   R      run1  pbs-1  176    38.55    3632.12  163.6   93.53%  207.4 GiB  256.00 GiB  81.10%
0002   R      run2  pbs-2  176    38.59    3589.72  93.13   52.91%  50.02 GiB  256.00 GiB  19.54%
...
Summary:
  jobs:         5
  unique nodes: 3
  states:       R=4  Q=1  X=0  other=0
  mean CPUeff:  75.20%
  mean avgCPU:  132.35
  mean memEff:  82.50%
  max memUsed:  230.16 GiB
```

Or, if run inside a running PBS job:

```text
JOBID  STATE  NAME   NODES  NCPUS  WALL(h)  CPUT(h)  avgCPU  CPUeff  memUsed     memReq     memEff
-----------------------------------------------------------------------------------------------------
0001   R      STDIN  pbs-5  100    0.03     0.01     0.22    0.22%   666.58 MiB  30.00 GiB  2.17%

Summary:
  jobs:        1
  mean CPUeff: 0.22%
  mean avgCPU: 0.22
  mean memEff: 2.17%
  max memUsed: 666.58 MiB
```

After the table, a summary reports the job count, mean CPU efficiency, mean average CPU usage, mean memory efficiency, and the peak memory used across all listed jobs.
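
The column names suggest the usual definitions: avgCPU as CPU time divided by wall time, CPUeff as avgCPU divided by the requested cores, and memEff as used memory divided by requested memory. The sketch below works under that assumption. It is not taken from the tool's source, and the tool's own rounding or formulas may differ; `job_efficiency` is a hypothetical helper.

```python
def job_efficiency(ncpus, wall_hours, cput_hours, mem_used_gib, mem_req_gib):
    """Compute per-job efficiency metrics under the assumed definitions."""
    # Average number of busy cores over the job's lifetime.
    avg_cpu = cput_hours / wall_hours if wall_hours > 0 else 0.0
    # Share of the requested cores that was actually kept busy.
    cpu_eff = 100.0 * avg_cpu / ncpus if ncpus > 0 else 0.0
    # Share of the requested memory that was actually used.
    mem_eff = 100.0 * mem_used_gib / mem_req_gib if mem_req_gib > 0 else 0.0
    return {"avgCPU": avg_cpu, "CPUeff": cpu_eff, "memEff": mem_eff}


if __name__ == "__main__":
    # Hypothetical job: 176 cores for 10 h, 880 CPU-hours, 64 of 256 GiB used.
    print(job_efficiency(176, 10.0, 880.0, 64.0, 256.0))
```

Under these definitions, a job that keeps half of its requested cores busy reports a CPUeff of 50%, which is a hint to request fewer cores next time.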

`psutil-monitor`

Real-time CPU and memory monitor for the system or a process tree. Use --gpu to also report aggregate GPU utilization and memory via NVML (requires nvidia-ml-py3). When --csv/--plot are used, metrics stream live to the terminal during the run; CSV/PNG files are written when the monitor exits (Ctrl+C, duration reached, or the process tree ends).

Example output files (generated with --plot and --csv):

- Plot (CPU + GPU stacked)
- CSV: docs/psutil-monitor-example.csv

GPU output fields (when --gpu is used):

- GPU util: Average utilization across visible GPUs.
- busyGPUs: Sum of utilization fractions (e.g., two GPUs at 50% each → 1.0).
- GPU mem %: Aggregate GPU memory usage percentage.
- Per-GPU (CSV gpu_pergpu): index:util%/used/total for each device.
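
The aggregate GPU fields can be illustrated with a short calculation over per-device readings. This is a sketch under assumptions: the input tuples and the `aggregate_gpu` name are hypothetical, and "aggregate memory usage" is taken here to mean total used memory divided by total memory across devices, which the text does not spell out.

```python
def aggregate_gpu(samples):
    """Aggregate per-GPU readings.

    samples: list of (util_percent, mem_used, mem_total) tuples, one per GPU.
    """
    if not samples:
        return {"gpu_util": 0.0, "busy_gpus": 0.0, "gpu_mem_pct": 0.0}
    # busyGPUs: sum of utilization fractions across devices.
    busy_gpus = sum(util for util, _, _ in samples) / 100.0
    # GPU util: average utilization across visible devices, in percent.
    gpu_util = 100.0 * busy_gpus / len(samples)
    used = sum(u for _, u, _ in samples)
    total = sum(t for _, _, t in samples)
    # Assumed definition: pooled used memory over pooled total memory.
    gpu_mem_pct = 100.0 * used / total if total else 0.0
    return {"gpu_util": gpu_util, "busy_gpus": busy_gpus, "gpu_mem_pct": gpu_mem_pct}


if __name__ == "__main__":
    # Two GPUs at 50% each, with 4 and 12 of 16 GiB in use.
    print(aggregate_gpu([(50.0, 4.0, 16.0), (50.0, 12.0, 16.0)]))
```

With two GPUs at 50% each, the sketch gives a busyGPUs value of 1.0 and a GPU util of 50%, matching the example in the field list.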

Examples:

```bash
# System-wide monitoring (the default) with console output only
psutil-monitor

# System-wide monitoring with CSV and PNG output
psutil-monitor --mode system --csv node.csv --plot node.png

# Monitor the current process tree (useful inside a PBS job)
psutil-monitor --mode proc --pid $$ --include-children --csv job.csv

# Monitor the resources used by script.py:
python script.py &                   # launch the workload
target=$!                            # PID of script.py
echo $target
# psutil-monitor watches that PID and exits when the process tree is gone
psutil-monitor --mode proc --pid "$target" --include-children --csv stat.csv --plot plot.png
```

Expected output:

```text
$ psutil-monitor

CPUs available (affinity): 384
Total memory available: 754.76 GiB
CPU basis for %: 384
Memory basis for %: 754.76 GiB
2025-08-14T15:20:14  CPU  79.67%  busyCPUs 305.93  (provided 384)  MEM   9.93%  used 74.96 GiB / total 754.76 GiB
2025-08-14T15:20:16  CPU  69.30%  busyCPUs 266.13  (provided 384)  MEM   9.95%  used 75.12 GiB / total 754.76 GiB
2025-08-14T15:20:18  CPU  61.34%  busyCPUs 235.53  (provided 384)  MEM  10.05%  used 75.82 GiB / total 754.76 GiB
2025-08-14T15:20:20  CPU  61.32%  busyCPUs 235.47  (provided 384)  MEM  10.09%  used 76.15 GiB / total 754.76 GiB
2025-08-14T15:20:22  CPU  74.57%  busyCPUs 286.33  (provided 384)  MEM   9.94%  used 74.99 GiB / total 754.76 GiB
2025-08-14T15:20:24  CPU  85.94%  busyCPUs 330.01  (provided 384)  MEM   9.86%  used 74.44 GiB / total 754.76 GiB
Average busy CPUs over run: 276.570
Peak memory (system): 76.15 GiB
```

With GPUs (--gpu and NVIDIA GPUs present):

```text
$ psutil-monitor --gpu

CPUs available (affinity): 96
Total memory available: 503.70 GiB
CPU basis for %: 96
Memory basis for %: 503.70 GiB
GPUs detected (NVML): 4
2026-02-03T10:00:14  CPU  45.12%  busyCPUs 43.32  (provided 96)  MEM   8.10%  used 40.80 GiB / total 503.70 GiB  GPU util  57.5% busyGPUs 2.30 mem  42.0%
2026-02-03T10:00:16  CPU  48.33%  busyCPUs 46.39  (provided 96)  MEM   8.20%  used 41.30 GiB / total 503.70 GiB  GPU util  63.0% busyGPUs 2.52 mem  44.1%
2026-02-03T10:00:18  CPU  52.10%  busyCPUs 49.99  (provided 96)  MEM   8.25%  used 41.60 GiB / total 503.70 GiB  GPU util  68.7% busyGPUs 2.75 mem  45.3%
Average busy CPUs over run: 46.567
Average busy GPUs over run: 2.523
Peak memory (system): 41.60 GiB
```
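
The closing summary lines can be reproduced from the per-sample lines of the GPU run shown above. The sketch below parses captured console output with regular expressions. It assumes the console format stays as shown, and `summarize` is a hypothetical helper; the tool itself presumably averages its raw samples rather than the rounded printed values.

```python
import re

# The three sample lines from the --gpu run shown above.
SAMPLE_LINES = [
    "2026-02-03T10:00:14  CPU  45.12%  busyCPUs 43.32  (provided 96)  MEM   8.10%  used 40.80 GiB / total 503.70 GiB  GPU util  57.5% busyGPUs 2.30 mem  42.0%",
    "2026-02-03T10:00:16  CPU  48.33%  busyCPUs 46.39  (provided 96)  MEM   8.20%  used 41.30 GiB / total 503.70 GiB  GPU util  63.0% busyGPUs 2.52 mem  44.1%",
    "2026-02-03T10:00:18  CPU  52.10%  busyCPUs 49.99  (provided 96)  MEM   8.25%  used 41.60 GiB / total 503.70 GiB  GPU util  68.7% busyGPUs 2.75 mem  45.3%",
]

BUSY_CPU = re.compile(r"busyCPUs\s+([\d.]+)")
BUSY_GPU = re.compile(r"busyGPUs\s+([\d.]+)")
MEM_USED = re.compile(r"used\s+([\d.]+)\s+GiB")


def _mean(values):
    return sum(values) / len(values) if values else None


def summarize(lines):
    """Return average busy CPUs/GPUs and peak memory (GiB) from console lines."""
    cpus, gpus, mem = [], [], []
    for line in lines:
        for pattern, bucket in ((BUSY_CPU, cpus), (BUSY_GPU, gpus), (MEM_USED, mem)):
            match = pattern.search(line)
            if match:
                bucket.append(float(match.group(1)))
    return {
        "avg_busy_cpus": _mean(cpus),
        "avg_busy_gpus": _mean(gpus),
        "peak_mem_gib": max(mem) if mem else None,
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE_LINES).items():
        print(f"{key}: {value:.3f}")
```

For these three samples the averages round to 46.567 busy CPUs and 2.523 busy GPUs, with a peak of 41.60 GiB, in line with the summary printed by the monitor.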

Use the --help option of each command to see all available options.

`slurm-bulk-user-stats`

Summarize CPU and memory usage for Slurm jobs and show which nodes the jobs are allocated to. The command relies on sacct being available in your PATH. The table includes NGPUS based on AllocTres/AllocGRES when present. If TRES GPU usage metrics are available, the summary also reports mean GPU util and GPU hours (used/requested).

State codes (Slurm):

R/RUNNING, PD/PENDING, CD/COMPLETED; other states (e.g., F, CG, S, TO) are grouped under “other” in the summary and listed in the breakdown.
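
The grouping of Slurm states into the summary buckets can be sketched as follows. This is illustrative only: the mapping and the `count_states` name are hypothetical, and the tool's handling of less common state strings may differ.

```python
from collections import Counter

# Short and long Slurm state names mapped to the summary buckets named above.
STATE_GROUPS = {
    "R": "R", "RUNNING": "R",
    "PD": "PD", "PENDING": "PD",
    "CD": "CD", "COMPLETED": "CD",
}


def count_states(states):
    """Count jobs per summary bucket; anything unrecognized goes to 'other'."""
    counts = Counter({"R": 0, "PD": 0, "CD": 0, "other": 0})
    for state in states:
        counts[STATE_GROUPS.get(state.strip().upper(), "other")] += 1
    return dict(counts)


if __name__ == "__main__":
    print(count_states(["RUNNING", "PD", "F", "TO", "COMPLETED", "R"]))
```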

Examples:

```bash
# Summarize a specific job and write CSV output
slurm-bulk-user-stats --job 12345 --csv stats.csv

# Summarize all running jobs for the current user (default)
slurm-bulk-user-stats

# Summarize all jobs (including finished) for a specific user
slurm-bulk-user-stats --user myuser --include-finished

# Include finished jobs but only keep 20 finished entries
slurm-bulk-user-stats --user myuser --include-finished --finished-limit 20

# Faster best-effort mode: fetch active jobs + only N finished job details
slurm-bulk-user-stats --user myuser --include-finished --finished-limit 20 --finished-limit-strategy fetch
```

When invoked with no --user or --job options:

- On a login node (no $SLURM_JOB_ID present), it summarizes pending/running jobs for the current user.
- Inside a running Slurm job (where $SLURM_JOB_ID is set), it automatically summarizes that specific job.

```bash
slurm-bulk-user-stats
```

The output mirrors the PBS version, showing job state, node list, CPU/memory usage, efficiency metrics, and a summary block with job counts and averages.

Release history

- 1.3.3: Mar 19, 2026
- 1.3.2: Feb 13, 2026
- 1.3.1: Feb 12, 2026
- 1.3: Feb 12, 2026 (this version)
- 1.2: Feb 3, 2026
