View a SLURM cluster and inspect nodes and jobs.

Slurm Viewer

Introduction

View the status of a SLURM cluster, including its nodes and job queue. The application can be run on the cluster itself or on any computer that can SSH into the cluster. Running it over an SSH connection, especially through a jump host, can be slow.

Features:

  • Overview of all nodes or just nodes in a set of partitions.
  • Limit to nodes with GPUs / available GPUs.
  • Show the running jobs on a selection of partitions and the jobs waiting to be scheduled.
  • Show the GPU memory used over the last 4 weeks.

View the nodes in the selected partitions. (screenshot)

View the queue of running and pending jobs. (screenshot)

View the GPU utilization and memory usage. (screenshot)

Installation

pip install slurm-viewer

Usage

Run slurm-viewer-init to create a default settings file at ~/.config/slurm-viewer/settings.toml, then edit the file to reflect your setup. Once you have finished, run slurm-viewer to start the UI.

Settings

The config file consists of several sections. You can add multiple SLURM clusters, each as its own [[clusters]] entry.

[ui]
node_columns = ["node_name", "state", "gpu_tot", "gpu_alloc", "gpu_avail", "gpu_type", "gpu_mem", "cpu_tot", "cpu_alloc", "cpu_avail", "mem_tot", "mem_alloc", "mem_avail", "cpu_gpu", "mem_gpu", "cpuload", "partitions", "active_features"]
queue_columns = ["user", "job_id", "reason", "exec_host", "start_delay", "run_time", "time_limit", "command"]
priority_columns = ["user_name", "job_id", "job_priority_n", "age_n", "fair_share_n", "partition_name"]

[[clusters]]
name = "cluster_1"
partitions = ["cpu", "gpu"]
node_name_ignore_prefix = ["node"]
servers = ["cluster_1_logon_node_1", "cluster_1_logon_node_2"]

[[clusters]]
name = "cluster_2"
partitions = ["cpu-short", "cpu-medium", "cpu-long", "gpu-short", "gpu-medium", "gpu-long"]
server = "cluster_2.logon.node"
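To check an edited settings file before starting the UI, a small script can parse it and list what each cluster entry contains. The following is a minimal sketch, not part of slurm-viewer: the function name `summarize_clusters` and the embedded `EXAMPLE` text are illustrative, and accepting both the `servers` and `server` keys is a choice made here because both appear in the example above. It assumes Python 3.11+ for `tomllib` and falls back to the third-party `tomli` package on older versions.

```python
try:
    import tomllib  # standard library from Python 3.11
except ImportError:  # older Python: assumes the third-party `tomli` package
    import tomli as tomllib

# Shortened copy of the example settings above (illustrative only).
EXAMPLE = """
[[clusters]]
name = "cluster_1"
partitions = ["cpu", "gpu"]
node_name_ignore_prefix = ["node"]
servers = ["cluster_1_logon_node_1", "cluster_1_logon_node_2"]

[[clusters]]
name = "cluster_2"
partitions = ["cpu-short", "gpu-short"]
server = "cluster_2.logon.node"
"""


def summarize_clusters(toml_text):
    """Return name, partitions and login hosts for every [[clusters]] table."""
    data = tomllib.loads(toml_text)
    summary = []
    for cluster in data.get("clusters", []):
        # Accept a `servers` list and/or a single `server` string.
        # This is the sketch's own handling, not slurm-viewer's logic.
        hosts = list(cluster.get("servers", []))
        if "server" in cluster:
            hosts.append(cluster["server"])
        summary.append(
            {
                "name": cluster["name"],
                "partitions": cluster.get("partitions", []),
                "hosts": hosts,
            }
        )
    return summary


if __name__ == "__main__":
    for entry in summarize_clusters(EXAMPLE):
        print(entry["name"], entry["partitions"], entry["hosts"])
```

To inspect a real file, read ~/.config/slurm-viewer/settings.toml as text and pass its contents to `summarize_clusters` instead of `EXAMPLE`.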

If you need to connect through a jump host/gateway, set up the connection in ~/.ssh/config and use the Host name defined there as the server.

Example of an SSH config:

Host gateway_1
  User my_user_name
  HostName gateway.somewhere
  
Host cluster_1
  User my_user_name
  HostName logonnode.somewhere
  ProxyCommand ssh -W %h:%p gateway_1
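Before pointing slurm-viewer at a host alias, it can help to confirm that the alias works non-interactively and that the SLURM tools are reachable through it. The sketch below is an assumption-laden helper, not something slurm-viewer provides: `ssh_command` and `check_host` are hypothetical names, and it assumes `ssh` is on the local PATH, key-based login is configured (hence `BatchMode=yes`, which disables password prompts), and `sinfo` is installed on the login node.

```python
import subprocess


def ssh_command(host, remote_command):
    """Build an ssh invocation that fails instead of prompting for input."""
    return ["ssh", "-o", "BatchMode=yes", host, remote_command]


def check_host(host, timeout=30):
    """Return True if `sinfo --version` succeeds on the given SSH host alias."""
    cmd = ssh_command(host, "sinfo --version")
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is missing locally, or the connection took too long.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # "cluster_1" is the Host alias from the SSH config example above.
    print("cluster_1 reachable:", check_host("cluster_1"))
```

If the check fails, running the same `ssh` command by hand usually shows whether the problem lies with the gateway, the login node, or the SLURM installation.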
