A utility for managing SLURM jobs and nodes with enhanced display features.
WrapSlurm
WrapSlurm is a wrapper for SLURM job management that simplifies job submission, resource querying, and log monitoring. It provides four commands, wrun, wlog, wqueue, and winfo, which the examples below also invoke by the short forms wr, wl, wq, and wi. It is aimed at researchers and engineers working on high-performance computing (HPC) clusters.
Features
- Simplified Job Submission (wr):
  - Automatically detect optimal resources (nodes, partitions, CPUs, memory, GPUs) based on the cluster's configuration.
  - Support for interactive and non-interactive SLURM jobs.
  - Customizable SLURM settings like time, tasks per node, and exclusions.
- Log Monitoring (wl):
  - Watch real-time SLURM logs for specific job IDs or the latest job.
- Queue Visualization (wq):
  - View and analyze job queues in a prettified table format with color-coded states.
- Node Resource Querying (wi):
  - Display detailed SLURM node information, including memory, CPU, and GPU usage.
Installation
WrapSlurm is available on PyPI and can be installed using pip:
pip install wrapslurm
Post-Installation Notes
If the scripts wrun, wlog, wqueue, and winfo are installed in a directory not included in your system's PATH (e.g., ~/.local/bin), you may need to update your PATH environment variable:
- Add the following line to your shell configuration file (~/.bashrc or ~/.zshrc):
  export PATH="$PATH:$HOME/.local/bin"
- Reload your shell:
  source ~/.bashrc # or source ~/.zshrc
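To confirm that the PATH change took effect, a quick check along these lines can help. This is a minimal sketch and not part of WrapSlurm; path_contains is a hypothetical helper name introduced only for illustration.

```shell
# Hypothetical helper (not part of WrapSlurm): succeed if directory $1
# appears as an entry in the colon-separated list $2 (defaults to $PATH).
path_contains() {
    case ":${2-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$HOME/.local/bin"; then
    echo "~/.local/bin is on PATH"
else
    echo "~/.local/bin is missing from PATH"
fi

# command -v prints the resolved location of each script if the shell can find it.
for cmd in wrun wlog wqueue winfo; do
    command -v "$cmd" || echo "$cmd not found"
done
```

Wrapping the list in colons before matching avoids false positives from partial matches such as /usr inside /usr/bin.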
Usage
1. Submit a Job (wrun)
Basic Usage:
Submit a script with auto-detected resources:
wr ./train_script.py
Specify Resources:
Submit a job with explicit resources:
wr --nodes 2 --partition gp4d --account ENT212162 --cpus-per-task 8 --memory 200G --gpus 4 ./train_script.py
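For readers used to plain SLURM, the request above corresponds roughly to the sbatch call sketched here. The mapping is an assumption: how WrapSlurm actually builds its submission is not documented in this README, and build_sbatch_cmd is a hypothetical helper that only prints the command.

```shell
# Sketch only: prints (does not run) a plain sbatch command requesting the
# same resources as the wr example. The wr-to-sbatch mapping is an assumption;
# note that --memory on the wr side becomes --mem on the sbatch side.
build_sbatch_cmd() {
    printf '%s %s\n' \
        "sbatch --nodes=2 --partition=gp4d --account=ENT212162 --cpus-per-task=8 --mem=200G --gpus=4" \
        "$1"
}

# sbatch expects the submitted script to begin with a "#!" interpreter line.
build_sbatch_cmd ./train_script.py
```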
Interactive Mode:
Start an interactive session:
wr
Full Help:
View all available options:
wr --help
2. Monitor Logs (wlog)
Watch the Latest Log File:
wl
Watch Logs for a Specific Job ID:
wl --job-id 12345678
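Finding "the latest log" can be approximated in plain shell as follows. This sketch assumes sbatch's default output naming, slurm-<jobid>.out; where WrapSlurm writes its logs is not stated here, and latest_slurm_log is a hypothetical helper.

```shell
# Hypothetical helper: print the most recently modified slurm-*.out file in
# directory $1 (default: current directory). Prints nothing if none exist.
# Assumes sbatch's default output naming; WrapSlurm may store logs elsewhere.
latest_slurm_log() {
    ls -t "${1:-.}"/slurm-*.out 2>/dev/null | head -n 1
}

# Typical use (left commented out because tail -f runs until interrupted):
# tail -f "$(latest_slurm_log)"
```

For a specific job, the same naming assumption gives tail -f slurm-12345678.out.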
3. View Job Queue (wqueue)
Display the job queue in a table format:
wqueue
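The color-coded state idea can be illustrated with standard tools. The filter below is a sketch under stated assumptions, not WrapSlurm's implementation: colorize_states is a hypothetical name, and the color choices are arbitrary.

```shell
# Hypothetical filter: wrap the state column (field 2) in ANSI colors.
# RUNNING is green, PENDING is yellow, any other state is red.
colorize_states() {
    awk '{
        color = "31"
        if ($2 == "RUNNING") color = "32"
        else if ($2 == "PENDING") color = "33"
        $2 = "\033[" color "m" $2 "\033[0m"
        print
    }'
}

# Typical use on a cluster (needs squeue, so it is left commented out):
# squeue -h -u "$USER" -o "%i %T %j" | colorize_states
printf '101 RUNNING train\n102 PENDING eval\n' | colorize_states
```

In the squeue format string, %i is the job ID, %T the long-form state, and %j the job name; -h suppresses the header row.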
4. Query Node Resources (winfo)
Basic Usage:
winfo
Include Down or Drained Nodes:
winfo --include-down
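The effect of --include-down can be pictured as a filter over per-node sinfo output. This is a sketch of the idea only: filter_nodes is a hypothetical helper, and whether winfo hides down or drained nodes by default is inferred from the flag name rather than stated in this README.

```shell
# Hypothetical filter: drop nodes whose state (field 2) starts with "down"
# or "drain", unless the first argument is "--include-down".
filter_nodes() {
    if [ "${1:-}" = "--include-down" ]; then
        cat
    else
        awk '$2 !~ /^(down|drain)/'
    fi
}

# Typical use on a cluster (needs sinfo, so it is left commented out):
# sinfo -N -h -o "%N %T %c %m %G" | filter_nodes
printf 'n1 idle 32 192000 gpu:4\nn2 down* 32 192000 gpu:4\n' | filter_nodes
```

In the sinfo format string, %N is the node name, %T the long-form state, %c the CPU count, %m the memory in megabytes, and %G the generic resources (such as GPUs).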
Example Workflow
- Query available resources:
  wi
- Submit a job:
  wr --account xxxxxx --time 2-00:00:00 ./train_script.py
- Monitor job logs:
  wl
- Check the queue:
  wq
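The --time value in the submit step, 2-00:00:00, uses SLURM's days-hours:minutes:seconds form, so it requests two days. A small sketch for converting such a value to seconds follows; slurm_time_to_seconds is a hypothetical helper and handles only this one form.

```shell
# Hypothetical helper: convert a D-HH:MM:SS time limit to seconds.
# Only this form is handled; SLURM also accepts others (e.g. HH:MM:SS).
slurm_time_to_seconds() {
    echo "$1" | awk -F'[-:]' '{ print $1 * 86400 + $2 * 3600 + $3 * 60 + $4 }'
}

slurm_time_to_seconds 2-00:00:00   # prints 172800
```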
Development
Cloning the Repository
git clone https://github.com/yourusername/wrapslurm.git
cd wrapslurm
Install Dependencies
Install the required Python packages:
pip install -r requirements.txt
Run Tests
Execute unit tests:
pytest
Contributing
We welcome contributions! Please follow these steps:
- Fork the repository.
- Create a feature branch:
git checkout -b feature-name
- Commit your changes:
git commit -m "Add feature-name"
- Push to your fork:
git push origin feature-name
- Submit a pull request.
License
This project is licensed under the MIT License.
Acknowledgments
Special thanks to the SLURM community for making HPC resource management accessible to researchers worldwide.