View a SLURM cluster and inspect nodes and jobs.
Slurm Viewer
Introduction
View the status of a SLURM cluster, including nodes and queue. This application can be run on the cluster itself or on any computer that can SSH into the cluster. Running it over an SSH connection, especially through a jump host, can be slow.
Features:
- Overview of all nodes or just nodes in a set of partitions.
- Limit to nodes with GPUs / available GPUs.
- Show the running jobs on a selection of partitions and the jobs waiting to be scheduled.
- Show the GPU memory used over the last 4 weeks.
View the nodes in the selected partitions.
View the queue of running and pending jobs.
View the GPU utilization and memory usage.
Installation
pip install slurm-viewer
Usage
Run slurm-viewer-init to create a default settings file stored in ~/.config/slurm-viewer/settings.toml.
Edit this file to reflect your setup. Once you have finished, run slurm-viewer to start the UI.
Settings
The config file consists of several sections. You can add multiple Slurm clusters, each as its own [[clusters]] entry.
[ui]
node_columns = ["node_name", "state", "gpu_tot", "gpu_alloc", "gpu_avail", "gpu_type", "gpu_mem", "cpu_tot", "cpu_alloc", "cpu_avail", "mem_tot", "mem_alloc", "mem_avail", "cpu_gpu", "mem_gpu", "cpuload", "partitions", "active_features"]
queue_columns = ["user", "job_id", "reason", "exec_host", "start_delay", "run_time", "time_limit", "command"]
priority_columns = ["user_name", "job_id", "job_priority_n", "age_n", "fair_share_n", "partition_name"]
[[clusters]]
name = "cluster_1"
partitions = ["cpu", "gpu"]
node_name_ignore_prefix = ["node"]
servers = ["cluster_1_logon_node_1", "cluster_1_logon_node_2"]
[[clusters]]
name = "cluster_2"
partitions = ["cpu-short", "cpu-medium", "cpu-long", "gpu-short", "gpu-medium", "gpu-long"]
server = "cluster_2.logon.node"
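To illustrate how the settings file is structured, here is a minimal sketch that parses a shortened copy of the example with Python's TOML reader and lists each cluster with its partitions and login nodes. The helper names `login_nodes` and `summarize` are hypothetical, and treating `server` and `servers` as interchangeable is an assumption based only on both spellings appearing in the example. This is not slurm-viewer's own loader.

```python
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older Pythons

# Shortened copy of the example settings, inlined so the sketch is self-contained.
SETTINGS = """
[ui]
queue_columns = ["user", "job_id", "reason"]

[[clusters]]
name = "cluster_1"
partitions = ["cpu", "gpu"]
node_name_ignore_prefix = ["node"]
servers = ["cluster_1_logon_node_1", "cluster_1_logon_node_2"]

[[clusters]]
name = "cluster_2"
partitions = ["cpu-short", "gpu-short"]
server = "cluster_2.logon.node"
"""


def login_nodes(cluster: dict) -> list:
    """Return the login nodes of one [[clusters]] entry as a list.

    Assumption: either a `servers` list or a single `server` string may be
    present, because both spellings appear in the example settings.
    """
    if "servers" in cluster:
        return list(cluster["servers"])
    if "server" in cluster:
        return [cluster["server"]]
    return []


def summarize(settings_text: str) -> dict:
    """Map each cluster name to its partitions and login nodes."""
    settings = tomllib.loads(settings_text)
    return {
        cluster["name"]: {
            "partitions": cluster.get("partitions", []),
            "servers": login_nodes(cluster),
        }
        for cluster in settings.get("clusters", [])
    }


summary = summarize(SETTINGS)
for name, info in summary.items():
    print(name, info["partitions"], info["servers"])
```

To check your own file instead of the inlined text, pass the contents of ~/.config/slurm-viewer/settings.toml to `summarize`. Each `[[clusters]]` header starts a new element in a TOML array of tables, which is why several clusters can share the same key names.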
If you need to connect through a jump host/gateway, use ~/.ssh/config to set up the connections and use the Host name as the server.
Example of an SSH config:
Host gateway_1
    User my_user_name
    HostName gateway.somewhere

Host cluster_1
    User my_user_name
    HostName logonnode.somewhere
    ProxyCommand ssh -W %h:%p gateway_1
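Before pointing the viewer at a Host alias, it can help to confirm that the alias works non-interactively. The sketch below builds an ssh command that runs `sinfo --version` on the cluster through the alias. The function names `build_ssh_check` and `run_check` are hypothetical, and this is a manual check under the assumption that key-based login is configured. It does not show how slurm-viewer itself connects.

```python
import shlex
import subprocess


def build_ssh_check(host_alias: str,
                    remote_command: str = "sinfo --version",
                    timeout_s: int = 10) -> list:
    """Build an ssh argv that runs one command on the given Host alias.

    BatchMode=yes makes ssh fail instead of prompting for a password, and
    ConnectTimeout bounds how long a dead gateway can stall the check.
    """
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", f"ConnectTimeout={timeout_s}",
        host_alias,
        remote_command,
    ]


def run_check(host_alias: str) -> bool:
    """Run the check and report whether the remote command exited with 0."""
    result = subprocess.run(
        build_ssh_check(host_alias), capture_output=True, text=True
    )
    return result.returncode == 0


# Only build and print the command here; call run_check("cluster_1") to
# actually connect through the gateway defined in ~/.ssh/config.
command = build_ssh_check("cluster_1")
print(shlex.join(command))
```

If `run_check("cluster_1")` returns False, the alias, the gateway entry, or the key setup is the likely place to look before editing the server value in settings.toml.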