Resource Monitor

This package contains utilities to monitor system resource utilization (CPU, memory, disk, network).

Here are the ways you can use it:

  • Monitor resource utilization for a compute node for a given set of resource types and process IDs.
  • Start a process and monitor its resource utilization.
  • Monitor resource utilization for a compute node asynchronously with the ability to dynamically change the resource types and process IDs being monitored.
  • Produce JSON reports of aggregated metrics.
  • Produce interactive HTML plots of the statistics.

Installation

  1. Create a Python virtual environment and activate it. Adjust as necessary if using Windows.
$ python -m venv ~/python-envs/rmon
$ source ~/python-envs/rmon/bin/activate
  2. Install the package.
$ pip install rmon
  3. Optionally, install jq by following instructions at https://jqlang.github.io/jq/download/.

Usage

CLI tool to monitor resource utilization

This command will monitor CPU, memory, and disk utilization every second and then plot the results whenever the user terminates the application.

$ rmon collect --cpu --memory --disk -i1 --plots -n run1

View the results in a table:

$ sqlite3 -table stats-output/run1.sqlite "select * from cpu"
$ sqlite3 -table stats-output/run1.sqlite "select * from memory"
$ sqlite3 -table stats-output/run1.sqlite "select * from disk"
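
The same tables can also be read from Python with the standard-library sqlite3 module. The following is a minimal sketch, assuming only the database path and the table names (cpu, memory, disk) shown in the commands above. The helper name read_table is hypothetical and not part of rmon. Column names are discovered at runtime rather than assumed.

```python
import sqlite3


def read_table(db_path, table):
    """Return (column_names, rows) for one table in a SQLite file.

    Hypothetical helper for illustration; it is not part of rmon.
    """
    # Table names cannot be bound as SQL parameters, so accept only
    # plain identifiers before interpolating the name into the query.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    con = sqlite3.connect(db_path)
    try:
        cur = con.execute(f"SELECT * FROM {table}")
        # cursor.description holds one 7-tuple per column; item 0 is the name.
        columns = [d[0] for d in cur.description]
        rows = cur.fetchall()
    finally:
        con.close()
    return columns, rows
```

For example, read_table("stats-output/run1.sqlite", "cpu") should return the column names and all rows of the cpu table, provided that file exists.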

This command will monitor CPU and memory utilization for specific process IDs and then plot the results whenever the user terminates the application.

$ rmon collect -i1 --plots -n run1 PID1 PID2 ...

View the results in a table:

$ sqlite3 -table stats-output/run1.sqlite "select * from process"

View min/max/avg metrics:

$ jq -s . stats-output/run1_results.json
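
The -s (slurp) flag tells jq to gather every JSON value in the file into a single array. A rough Python equivalent is sketched below. It assumes the results file holds one or more whitespace-separated JSON values; the use of -s suggests this, but the layout is not stated here. The helper name load_json_stream is hypothetical and not part of rmon.

```python
import json


def load_json_stream(text):
    """Parse zero or more whitespace-separated JSON values into a list.

    Hypothetical helper for illustration; it is not part of rmon.
    """
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it here.
        if text[pos].isspace():
            pos += 1
            continue
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values
```

Reading the file from the command above would then look like load_json_stream(open("stats-output/run1_results.json").read()).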

Refer to rmon collect --help to see all options.

CLI tool to start a process and monitor its resource utilization

$ rmon monitor-process -i1 --plots python my_script.py ARGS [OPTIONS]

Use the same steps above to view results.

CLI tool to monitor resource utilization with dynamic changes

This command will monitor CPU, memory, and disk utilization every second. It will present user prompts that allow you to change what is being monitored. It will plot the results when you select the exit command.

$ rmon collect -i1 --plots -n run1 --interactive PID1 PID2 ...

You can use this asynchronous functionality in your own application if you are controlling the processes being monitored. Refer to resource_monitor/cli/collect.py for example code. Search for run_monitor_async.

Collect stats for all compute nodes in an HPC job

The directory contains some example scripts that can be deployed in a Slurm job to collect stats for your compute nodes.

  1. Copy collect_stats.sh and wait_for_stats.sh to your HPC runtime directory.

  2. Modify collect_stats.sh such that it loads the environment containing rmon.

  3. Modify collect_stats.sh with your desired options for rmon collect.

  4. Modify your sbatch script with relevant lines from batch_job.sh.

When Slurm runs your job, the following steps occur in order:

  • Ensure that the file shutdown does not exist.
  • Start rmon collect as a background operation.
  • Run your job.
  • Create the file shutdown. That will trigger collect_stats.sh to stop.
  • Gracefully shut down rmon and generate plots.
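
The shutdown file in the steps above acts as a sentinel: one side creates it and the other polls for it. The sketch below illustrates that general pattern in Python. It is not the content of collect_stats.sh or wait_for_stats.sh, and the function name, polling interval, and timeout are assumptions made for illustration.

```python
import time
from pathlib import Path


def wait_for_file(path, poll_seconds=1.0, timeout=None):
    """Block until path exists.

    Return True once the file appears, or False if timeout seconds pass
    first. Hypothetical helper for illustration; it is not part of rmon.
    """
    path = Path(path)
    # A monotonic clock is unaffected by system clock adjustments.
    deadline = None if timeout is None else time.monotonic() + timeout
    while not path.exists():
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    return True
```

A collector wrapper could call wait_for_file("shutdown") and then stop its monitoring process, while the job script creates the file once the main work finishes.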

Code timings

Refer to this page for instructions on how to collect timing statistics of targeted functions.

License

rmon is released under a BSD 3-Clause license.

Software Record

This package is developed under NREL Software Record SWR-24-128.
