
A utility for managing SLURM jobs and nodes with enhanced display features.


WrapSlurm

WrapSlurm is a wrapper for SLURM job management that simplifies job submission, resource querying, and log monitoring. It provides the commands wrun, wlog, wqueue, and winfo (the examples below also use the short forms wr, wl, wq, and wi) and is aimed at researchers and engineers working on high-performance computing (HPC) clusters.


Features

  • Simplified Job Submission (wr):

    • Automatically detect optimal resources (nodes, partitions, CPUs, memory, GPUs) based on the cluster's configuration.
    • Run interactive and non-interactive SLURM jobs.
    • Customize SLURM settings such as time limit, tasks per node, and node exclusions.
  • Log Monitoring (wl):

    • Watch real-time SLURM logs for specific job IDs or the latest job.
  • Queue Visualization (wq):

    • View and analyze job queues in a prettified table format with color-coded states.
  • Node Resource Querying (wi):

    • Display detailed SLURM node information, including memory, CPU, and GPU usage.

Installation

WrapSlurm is available on PyPI and can be installed using pip:

pip install wrapslurm

Post-Installation Notes

If the scripts wrun, wlog, wqueue, and winfo are installed in a directory not included in your system's PATH (e.g., ~/.local/bin), you may need to update your PATH environment variable:

  1. Add the following line to your shell configuration file (~/.bashrc or ~/.zshrc):

    export PATH="$PATH:$HOME/.local/bin"
    
  2. Reload your shell:

    source ~/.bashrc  # or source ~/.zshrc
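
The PATH update above can be made safe to re-run and then checked in one go. The following is a minimal sketch, not part of WrapSlurm: path_contains is a hypothetical helper, and ~/.local/bin is only assumed to be the directory where pip placed the scripts.

```shell
# Return 0 if directory $1 is already an entry in the colon-separated list $2.
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

bin_dir="$HOME/.local/bin"   # assumed install location; pip may use another

# Append only when missing, so re-sourcing the file does not grow PATH.
if ! path_contains "$bin_dir" "$PATH"; then
  export PATH="$PATH:$bin_dir"
fi

# Report which of the four entry points the shell can now find.
for cmd in wrun wlog wqueue winfo; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: found"
  else
    echo "$cmd: not found on PATH"
  fi
done
```

If any command is reported as not found, the scripts were probably installed somewhere other than the assumed directory.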
    

Usage

1. Submit a Job (wrun)

Basic Usage:

Submit a script with auto-detected resources:

wr ./train_script.py

Specify Resources:

Submit a job with explicit resources:

wr --nodes 2 --partition gp4d --account ENT212162 --cpus-per-task 8 --memory 200G --gpus 4 ./train_script.py

Interactive Mode:

Start an interactive session:

wr

Full Help:

View all available options:

wr --help

2. Monitor Logs (wlog)

Watch the Latest Log File:

wl

Watch Logs for a Specific Job ID:

wl --job-id 12345678
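
For comparison, here is roughly what the two modes above look like in plain shell. This is a sketch under assumptions, not WrapSlurm's implementation: it assumes logs use SLURM's default output name (slurm-<jobid>.out) and sit in one directory, and latest_slurm_log and slurm_log_for_job are hypothetical helper names.

```shell
# Print the most recently modified slurm-*.out file in directory $1 (default: .).
# Assumes SLURM's default output naming; a custom --output pattern would differ.
latest_slurm_log() {
  dir="${1:-.}"
  # ls -t sorts newest first; head keeps only the first entry.
  ls -t "$dir"/slurm-*.out 2>/dev/null | head -n 1
}

# Print the expected log path for job ID $1 in directory $2 (default: .).
slurm_log_for_job() {
  echo "${2:-.}/slurm-$1.out"
}

# Usage (commented out because tail -f runs until interrupted):
#   tail -f "$(latest_slurm_log)"
#   tail -f "$(slurm_log_for_job 12345678)"
```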

3. View Job Queue (wqueue)

Display the job queue in a table format:

wqueue
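
To illustrate what color-coded states involve, the sketch below maps SLURM job state names to ANSI colors. The palette and the function names state_color and colorize_state are illustrative assumptions; the colors wqueue actually uses may differ.

```shell
# Map a SLURM job state to an ANSI color code (the palette is an illustrative choice).
state_color() {
  case "$1" in
    RUNNING)                  echo 32 ;;  # green
    PENDING)                  echo 33 ;;  # yellow
    FAILED|CANCELLED|TIMEOUT) echo 31 ;;  # red
    *)                        echo 0 ;;   # terminal default
  esac
}

# Print the state wrapped in the escape sequence for its color.
colorize_state() {
  printf '\033[%sm%s\033[0m\n' "$(state_color "$1")" "$1"
}

# Usage on a cluster (commented out because it needs squeue):
#   squeue -u "$USER" -h -o "%i %T" | while read -r id state; do
#     printf '%s ' "$id"; colorize_state "$state"
#   done
```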

4. Query Node Resources (winfo)

Basic Usage:

winfo

Include Down or Drained Nodes:

winfo --include-down

Example Workflow

  1. Query available resources:

    wi
    
  2. Submit a job:

    wr --account xxxxxx --time 2-00:00:00 ./train_script.py
    
  3. Monitor job logs:

    wl
    
  4. Check the queue:

    wq
    

Development

Clone the Repository

git clone https://github.com/yourusername/wrapslurm.git
cd wrapslurm

Install Dependencies

Install the required Python packages:

pip install -r requirements.txt

Run Tests

Execute unit tests:

pytest

Contributing

We welcome contributions! Please follow these steps:

  1. Fork the repository.
  2. Create a feature branch:
    git checkout -b feature-name
    
  3. Commit your changes:
    git commit -m "Add feature-name"
    
  4. Push to your fork:
    git push origin feature-name
    
  5. Submit a pull request.

License

This project is licensed under the MIT License.


Acknowledgments

Special thanks to the SLURM community for making HPC resource management accessible to researchers worldwide.

