A simple resource monitoring tool for SLURM jobs.

nuse

nuse is a simple resource monitoring tool for SLURM jobs. It lets you view CPU and memory usage logs for individual jobs and for the entire node. When running a SLURM-based workflow, you can monitor resource consumption from the command line.

Features

  • Job-Specific Monitoring: Capture and view resource usage for each job (e.g., using cgroup filtering).
  • Node-Wide Monitoring: Automatically collect a separate log for overall node usage.
  • Command-Line Interface: Installed via pip, the nuse command lets you quickly view logs with a simple one-liner.
  • Custom Log Directory: Easily configure where logs are stored by setting the MONITOR_LOG_DIR environment variable.

Installation

Install nuse directly from PyPI:

pip install nuse

Usage

In your script, include:

from nuse import start_monitoring

start_monitoring(filter_cgroup=True)

# your code

INFO: You can set filter_cgroup to False to watch the entire node and not just your own job.

NOTE: Call start_monitoring() in the script you submit to SLURM. Don't put it in a submitit script.

Job-Specific Log: To display the resource usage log for a specific job on a node, run the following in your terminal:

nuse node305 49847516

Here, node305 is the node's short name and 49847516 is the SLURM job ID.
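
If you don't want to look up the two arguments by hand, a job can print the matching command itself. The sketch below is not part of nuse; nuse_command is a hypothetical helper. It assumes the standard SLURM environment variables SLURM_JOB_ID and SLURMD_NODENAME are set inside the job, and it falls back to the local short hostname if the node name is missing.

```python
import os
import socket


def nuse_command(env=None):
    """Build the `nuse <node> <job_id>` command for the current job.

    Hypothetical helper, not part of nuse. Assumes SLURM has set
    SLURM_JOB_ID (and usually SLURMD_NODENAME) in the job environment.
    """
    env = os.environ if env is None else env
    job_id = env.get("SLURM_JOB_ID")
    if job_id is None:
        raise RuntimeError("SLURM_JOB_ID is not set; not inside a SLURM job?")
    # Fall back to the short hostname when SLURM did not export the node name.
    node = env.get("SLURMD_NODENAME") or socket.gethostname().split(".")[0]
    return f"nuse {node} {job_id}"
```

Calling print(nuse_command()) near the top of the submitted script writes the command into the job's output file, ready to copy.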

Configuration

Log Directory:

  • By default, nuse stores logs in the directory ~/.monitoring. To change the log directory, set the environment variable before running your jobs:
export MONITOR_LOG_DIR="/path/to/your/log_directory"
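
To find the logs from your own Python code, you can apply the same rule. This is a minimal sketch, not nuse's actual implementation: resolve_log_dir is a hypothetical helper, and it assumes nuse simply prefers MONITOR_LOG_DIR when set and otherwise uses ~/.monitoring, as described above.

```python
import os
from pathlib import Path

# Default location stated in this README.
DEFAULT_LOG_DIR = Path.home() / ".monitoring"


def resolve_log_dir(env=None):
    """Return the directory where the logs are expected to live.

    Hypothetical helper: assumes MONITOR_LOG_DIR, when set and non-empty,
    overrides the default directory.
    """
    env = os.environ if env is None else env
    custom = env.get("MONITOR_LOG_DIR")
    return Path(custom).expanduser() if custom else DEFAULT_LOG_DIR
```

For example, sorted(resolve_log_dir().glob("*.log")) lists the log files written so far, provided the directory exists.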

How it Works

When included in your SLURM job pipeline (via start_monitoring(filter_cgroup=True) from the nuse package), nuse will:

  • Create a job-specific log file with a naming convention like cpu_memory_usage_.cluster_<SLURM_JOB_ID>.log.

Then, the nuse CLI tool uses these logs to provide an easy-to-read, real-time view of resource usage.

Happy monitoring!

Planned improvements

  • A nuse clear command to remove the monitoring directory. Until then, delete the ~/.monitoring folder manually.
