Resource Monitor

This package contains utilities to monitor system resource utilization (CPU, memory, disk, network).

Here are the ways you can use it:

  • Monitor resource utilization for a compute node for a given set of resource types and process IDs.
  • Start a process and monitor its resource utilization.
  • Monitor resource utilization for a compute node asynchronously with the ability to dynamically change the resource types and process IDs being monitored.
  • Produce JSON reports of aggregated metrics.
  • Produce interactive HTML plots of the statistics.

Installation

  1. Create a Python virtual environment and activate it. Adjust as necessary if using Windows.
$ python -m venv ~/python-envs/rmon
$ source ~/python-envs/rmon/bin/activate
  2. Install the package.
$ pip install rmon
  3. Optionally, install jq by following instructions at https://jqlang.github.io/jq/download/.

Usage

CLI tool to monitor resource utilization

This command monitors CPU, memory, and disk utilization once per second and plots the results when the user terminates the application.

$ rmon collect --cpu --memory --disk -i1 --plots -n run1

View the results in a table:

$ sqlite3 -table stats-output/run1.sqlite "select * from cpu"
$ sqlite3 -table stats-output/run1.sqlite "select * from memory"
$ sqlite3 -table stats-output/run1.sqlite "select * from disk"

This command monitors CPU and memory utilization for specific process IDs and plots the results when the user terminates the application.

$ rmon collect -i1 --plots -n run1 PID1 PID2 ...

View the results in a table:

$ sqlite3 -table stats-output/run1.sqlite "select * from process"
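The same tables can also be read from Python with the standard-library sqlite3 module. The following is a minimal sketch, not part of rmon: the helper name read_table is hypothetical, and because the column layout of each table is not documented here, the sketch reads the column names from the cursor instead of assuming them.

```python
import sqlite3
from pathlib import Path


def read_table(db_path, table):
    """Return (column_names, rows) for one table in the stats database."""
    # Table names cannot be bound as SQL parameters, so validate before interpolating.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    con = sqlite3.connect(db_path)
    try:
        cur = con.execute(f"SELECT * FROM {table}")
        columns = [d[0] for d in cur.description]
        return columns, cur.fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    db = Path("stats-output/run1.sqlite")
    if db.exists():
        for name in ("cpu", "memory", "disk", "process"):
            try:
                columns, rows = read_table(db, name)
            except sqlite3.OperationalError:
                # Assumption: a table may be absent if that resource type was not collected.
                continue
            print(name, columns, f"{len(rows)} rows")
```

The table names cpu, memory, disk, and process come from the commands above; everything else in the sketch is illustrative.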

View min/max/avg metrics:

$ jq -s . stats-output/run1_results.json
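If jq is not available, the results file can be loaded in Python instead. The sketch below is an assumption-laden equivalent of jq -s: the schema of run1_results.json is not described here, so the hypothetical helper load_json_documents simply collects every JSON value in the file into a list, whether the file holds one document or several.

```python
import json
from pathlib import Path


def load_json_documents(text):
    """Parse one or more whitespace-separated JSON documents into a list."""
    decoder = json.JSONDecoder()
    docs = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        docs.append(obj)
    return docs


if __name__ == "__main__":
    path = Path("stats-output/run1_results.json")
    if path.exists():
        print(json.dumps(load_json_documents(path.read_text()), indent=2))
```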

Run rmon collect --help to see all options.

CLI tool to start a process and monitor its resource utilization

$ rmon monitor-process -i1 --plots -- python my_script.py ARGS [OPTIONS]
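When launching the monitored command from a Python script, it can help to assemble the argument list programmatically. This is a rough sketch only: build_monitor_command is a hypothetical helper, it uses just the options shown in the command above (-i, --plots, and the -- separator), and it assumes the value after -i is the sampling interval in seconds.

```python
import shlex


def build_monitor_command(target, interval=1, plots=True):
    """Build the argv list for `rmon monitor-process` wrapping `target`."""
    cmd = ["rmon", "monitor-process", f"-i{interval}"]
    if plots:
        cmd.append("--plots")
    # "--" separates rmon's own options from the command being monitored.
    cmd.append("--")
    cmd.extend(target)
    return cmd


if __name__ == "__main__":
    argv = build_monitor_command(["python", "my_script.py"])
    print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run when rmon is installed in the active environment.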

Use the same steps as above to view the results.

CLI tool to monitor resource utilization with dynamic changes

This command monitors CPU, memory, and disk utilization every second. It presents prompts that let you change what is being monitored, and it plots the results when you select the exit command.

$ rmon collect -i1 --plots -n run1 --interactive PID1 PID2 ...

You can use this asynchronous functionality in your own application if you are controlling the processes being monitored. Refer to resource_monitor/cli/collect.py for example code. Search for run_monitor_async.

Collect stats for all compute nodes in an HPC job

The source repository contains example scripts that can be deployed in a Slurm job to collect stats for your compute nodes.

  1. Copy collect_stats.sh and wait_for_stats.sh to your HPC runtime directory.

  2. Modify collect_stats.sh such that it loads the environment containing rmon.

  3. Modify collect_stats.sh with your desired options for rmon collect.

  4. Modify your sbatch script with relevant lines from batch_job.sh.

When Slurm runs your job, the following steps occur:

  • The batch script ensures that the file shutdown does not exist.
  • rmon collect starts as a background operation.
  • Your job runs.
  • The batch script creates the file shutdown, which triggers collect_stats.sh to stop.
  • rmon shuts down gracefully and generates plots.
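The sentinel-file handshake in the steps above can be illustrated in Python. This is a sketch of the pattern only, not the contents of collect_stats.sh or batch_job.sh: the collector function is a stand-in that records timestamps instead of running rmon, and the names collector and run_with_collector are hypothetical.

```python
import tempfile
import threading
import time
from pathlib import Path


def collector(shutdown_file, samples, poll_seconds=0.05):
    """Stand-in for the collector: sample until the sentinel file appears."""
    while not shutdown_file.exists():
        samples.append(time.monotonic())  # placeholder for a real measurement
        time.sleep(poll_seconds)


def run_with_collector(workdir, job):
    """Run `job` while a background collector runs, then signal it to stop."""
    shutdown_file = Path(workdir) / "shutdown"
    shutdown_file.unlink(missing_ok=True)  # 1. ensure the sentinel does not exist
    samples = []
    worker = threading.Thread(target=collector, args=(shutdown_file, samples))
    worker.start()                         # 2. start collection in the background
    try:
        result = job()                     # 3. run the job
    finally:
        shutdown_file.touch()              # 4. create the sentinel to request a stop
        worker.join()                      # 5. wait for the collector to finish
    return result, samples


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        result, samples = run_with_collector(tmp, lambda: sum(range(1000)))
        print(result, len(samples))
```

Creating the sentinel in a finally block means the collector is asked to stop even if the job raises, which mirrors the intent of always shutting down collection when the job ends.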

Code timings

Refer to this page for instructions on how to collect timing statistics of targeted functions.

License

rmon is released under a BSD 3-Clause license.

Software Record

This package is developed under NREL Software Record SWR-24-128.
