GPU Monitoring Callbacks for TensorFlow and PyTorch Lightning

gpumonitor

gpumonitor reports GPU usage statistics while your scripts and training runs execute, either directly in a script or through TensorFlow and PyTorch Lightning callbacks.

Installation

Install the package with pip:

pip install gpumonitor

Getting started

Option 1: In your scripts

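import gpumonitor

# Start monitoring; GPU stats are sampled according to `delay`
monitor = gpumonitor.GPUStatMonitor(delay=1)

# Your instructions here
# [...]

# Stop monitoring and print the averaged stats for each GPU
monitor.stop()
monitor.display_average_stats_per_gpu()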

The monitor keeps track of the average of the GPU statistics. To discard the accumulated average and start fresh, reset the monitor:

monitor = gpumonitor.GPUStatMonitor(delay=1)

# Your instructions here
# [...]

monitor.display_average_stats_per_gpu()
monitor.reset()

# Some other instructions
# [...]

monitor.display_average_stats_per_gpu()
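To make the averaging and reset behaviour concrete, here is a minimal sketch of a per-GPU running average that can be cleared. It is an illustration only, not gpumonitor's actual implementation: the `RunningAverage` class and its method names are hypothetical.

```python
from collections import defaultdict


class RunningAverage:
    """Hypothetical helper: per-GPU running mean that can be reset."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Drop all accumulated samples so the next average starts fresh.
        self._sums = defaultdict(float)
        self._counts = defaultdict(int)

    def update(self, gpu_index, value):
        # Record one new sample (e.g. a utilization reading) for a GPU.
        self._sums[gpu_index] += value
        self._counts[gpu_index] += 1

    def average(self, gpu_index):
        # Return None when no sample has been recorded since the last reset.
        count = self._counts[gpu_index]
        return self._sums[gpu_index] / count if count else None


avg = RunningAverage()
avg.update(0, 40.0)
avg.update(0, 60.0)
print(avg.average(0))  # 50.0

avg.reset()
avg.update(0, 90.0)
print(avg.average(0))  # 90.0, earlier samples no longer count
```

Storing sums and counts rather than every sample keeps memory use constant however long the monitored section runs.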

Option 2: Callbacks

Add the following callback to your training loop:

For TensorFlow,

from gpumonitor.callbacks.tf import TFGpuMonitorCallback

model.fit(x, y, callbacks=[TFGpuMonitorCallback(delay=0.5)])

For PyTorch Lightning,

import pytorch_lightning as pl
from gpumonitor.callbacks.lightning import PyTorchGpuMonitorCallback

trainer = pl.Trainer(callbacks=[PyTorchGpuMonitorCallback(delay=0.5)])
trainer.fit(model)

Display Format

You can customize the display format using the gpustat options. For example, power consumption (in watts) and fan speed can be displayed. To see which options you can change, refer to the gpustat documentation.

Sources

  • Built on top of GPUStat
  • Separate monitoring thread loop adapted from GPUtil
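The "separate thread loop" pattern mentioned above can be sketched with the standard library alone. This is a hedged illustration of the general technique, not the code used by gpumonitor or GPUtil: `PollingThread` and `sample_fn` are hypothetical names, and a real monitor would pass a function that queries the GPUs.

```python
import threading
import time


class PollingThread(threading.Thread):
    """Hypothetical sketch: call `sample_fn` every `delay` seconds until stopped."""

    def __init__(self, sample_fn, delay=1.0):
        super().__init__(daemon=True)
        self.sample_fn = sample_fn
        self.delay = delay
        self.samples = []
        self._stop_event = threading.Event()

    def run(self):
        # Event.wait doubles as an interruptible sleep: it returns as soon
        # as stop() sets the event, so the loop ends without a full delay.
        while not self._stop_event.is_set():
            self.samples.append(self.sample_fn())
            self._stop_event.wait(self.delay)

    def stop(self):
        # Signal the loop to exit, then wait for the thread to finish.
        self._stop_event.set()
        self.join()


# Stand-in sampler; a real one would read GPU statistics instead.
poller = PollingThread(sample_fn=time.monotonic, delay=0.01)
poller.start()
time.sleep(0.05)  # the main thread does its own work here
poller.stop()
print(f"collected {len(poller.samples)} samples")
```

Running the sampler in a daemon thread lets the main script or training loop proceed unblocked, and `stop()` gives a clean point at which to read the collected samples.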
