Resource Monitor

This package contains utilities to monitor system resource utilization (CPU, memory, disk, network).

Here are the ways you can use it:

  • Monitor resource utilization for a compute node for a given set of resource types and process IDs.
  • Start a process and monitor its resource utilization.
  • Monitor resource utilization for a compute node asynchronously with the ability to dynamically change the resource types and process IDs being monitored.
  • Produce JSON reports of aggregated metrics.
  • Produce interactive HTML plots of the statistics.

Installation

  1. Create a Python virtual environment and activate it. Adjust as necessary if using Windows.
$ python -m venv ~/python-envs/rmon
$ source ~/python-envs/rmon/bin/activate
  2. Install the package.
$ pip install rmon
  3. Optionally, install jq by following instructions at https://jqlang.github.io/jq/download/.

Usage

CLI tool to monitor resource utilization

This command monitors CPU, memory, and disk utilization every second and plots the results when the user terminates the application.

$ rmon collect --cpu --memory --disk -i1 --plots -n run1

View the results in a table:

$ sqlite3 -table stats-output/run1.sqlite "select * from cpu"
$ sqlite3 -table stats-output/run1.sqlite "select * from memory"
$ sqlite3 -table stats-output/run1.sqlite "select * from disk"
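
The same tables can also be read from Python with the standard-library sqlite3 module. The following is a minimal sketch, not part of rmon: the helper name read_table is hypothetical, and because the column layout of each table is not described here, the helper discovers the column names at run time instead of assuming them.

```python
import sqlite3


def read_table(db_path, table):
    """Return (column_names, rows) for one table in a SQLite file."""
    # Table names cannot be bound as SQL parameters, so validate the name
    # before interpolating it into the query.
    if not table.isidentifier():
        raise ValueError(f"invalid table name: {table!r}")
    con = sqlite3.connect(db_path)
    try:
        cur = con.execute(f'SELECT * FROM "{table}"')
        # cursor.description holds one 7-tuple per column; item 0 is the name.
        columns = [desc[0] for desc in cur.description]
        return columns, cur.fetchall()
    finally:
        con.close()
```

For example, read_table("stats-output/run1.sqlite", "cpu") would return the column names and all rows of the cpu table, assuming the file was produced by the command above.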

This command monitors CPU and memory utilization for specific process IDs and plots the results when the user terminates the application.

$ rmon collect -i1 --plots -n run1 PID1 PID2 ...

View the results in a table:

$ sqlite3 -table stats-output/run1.sqlite "select * from process"

View min/max/avg metrics:

$ jq -s . stats-output/run1_results.json
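
The -s (slurp) option tells jq to gather every JSON document in the input into one array. If jq is not installed, a rough Python equivalent is sketched below. It assumes only that the results file holds one or more JSON documents separated by whitespace; the structure of each document is not assumed, and the function name load_json_documents is hypothetical.

```python
import json


def load_json_documents(text):
    """Parse one or more whitespace-separated JSON documents into a list."""
    decoder = json.JSONDecoder()
    documents = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        # raw_decode returns the parsed object and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        documents.append(obj)
    return documents
```

To use it, read the file contents as text (for example with pathlib.Path("stats-output/run1_results.json").read_text()) and pass the string to load_json_documents.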

Refer to rmon collect --help to see all options.

CLI tool to start a process and monitor its resource utilization

$ rmon monitor-process -i1 --plots python my_script.py ARGS [OPTIONS]

Use the same steps above to view results.
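
If you want to launch the same command from a Python driver script, one option is to build the argument list and hand it to subprocess. This is a minimal sketch that uses only the flags shown above (-i and --plots); the helper name build_monitor_command is hypothetical, and it assumes that -i accepts other integer intervals in the same way it accepts 1.

```python
import shlex


def build_monitor_command(target, interval=1, plots=True):
    """Build the argv list for `rmon monitor-process` from the flags shown above."""
    cmd = ["rmon", "monitor-process", f"-i{interval}"]
    if plots:
        cmd.append("--plots")
    # Everything after the rmon options is the command to launch and monitor.
    cmd.extend(target)
    return cmd


cmd = build_monitor_command(["python", "my_script.py"])
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would then run it, provided rmon is installed in the active environment.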

CLI tool to monitor resource utilization with dynamic changes

This command will monitor CPU, memory, and disk utilization every second. It will present user prompts that allow you to change what is being monitored. It will plot the results when you select the exit command.

$ rmon collect -i1 --plots -n run1 --interactive PID1 PID2 ...

You can use this asynchronous functionality in your own application if you are controlling the processes being monitored. Refer to resource_monitor/cli/collect.py for example code. Search for run_monitor_async.

Collect stats for all compute nodes in an HPC job

The directory contains some example scripts that can be deployed in a Slurm job to collect stats for your compute nodes.

  1. Copy collect_stats.sh and wait_for_stats.sh to your HPC runtime directory.

  2. Modify collect_stats.sh such that it loads the environment containing rmon.

  3. Modify collect_stats.sh with your desired options for rmon collect.

  4. Modify your sbatch script with relevant lines from batch_job.sh.

The following will occur when Slurm runs your job:

  • Ensure that the file shutdown does not exist.
  • Start rmon collect as a background operation.
  • Run your job.
  • Create the file shutdown. That will trigger collect_stats.sh to stop.
  • Gracefully shut down rmon and generate plots.
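
The sequence above is a sentinel-file handshake. The sketch below illustrates the pattern in Python; it is not the content of collect_stats.sh or batch_job.sh. The function name run_with_collector and its arguments are hypothetical, and collector_cmd stands in for any command that exits on its own once the sentinel file appears.

```python
import subprocess
from pathlib import Path


def run_with_collector(collector_cmd, job_cmd, sentinel):
    """Run job_cmd while collector_cmd runs in the background.

    The collector is expected to exit on its own once `sentinel` exists.
    Returns the exit codes of the job and the collector.
    """
    sentinel = Path(sentinel)
    # 1. Remove a stale sentinel so an earlier run cannot stop the collector early.
    sentinel.unlink(missing_ok=True)
    # 2. Start the collector as a background process.
    collector = subprocess.Popen(collector_cmd)
    try:
        # 3. Run the actual job in the foreground.
        job = subprocess.run(job_cmd)
    finally:
        # 4. Create the sentinel to tell the collector to stop.
        sentinel.touch()
        # 5. Wait for the collector to finish its shutdown work.
        collector.wait()
    return job.returncode, collector.returncode
```

Creating the sentinel in a finally block means the collector is told to stop even if launching the job fails, so the background process is not left running.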

Code timings

Refer to this page for instructions on how to collect timing statistics of targeted functions.

License

rmon is released under a BSD 3-Clause license.

Software Record

This package is developed under NREL Software Record SWR-24-128.


