A simple tool for monitoring and displaying GPU stats

gpulink

A library and command-line tool for monitoring NVIDIA GPU stats.
gpulink uses pynvml - a Python wrapper for the NVIDIA Management Library (NVML).

Current status

⚠ gpulink is in a very early state - breaking changes between versions are possible!

Requirements

gpulink requires the NVIDIA Management Library (NVML) to be installed. NVML ships together with nvidia-smi, so a system on which nvidia-smi works already has it.

Installation

Installation using pip

To install gpulink with pip, the Python package installer, run:
pip install gpulink

Using from source

gpulink can also be used from source. To do so, create a Python environment and install the requirements:

  1. Create an environment: python -m venv env
  2. Activate the environment: .\env\Scripts\Activate (Windows) or source env/bin/activate (Linux/macOS)
  3. Install requirements: pip install -r requirements.txt

Command-line usage

gpulink can be used either from the command line or as a library. The command-line interface provides the following options and commands:

Usage: GPU-Link: Monitor NVIDIA GPUs [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  record   Record GPU properties.
  sensors  Fetch and print the GPU sensor status.

Examples

  • View GPU sensor status: gpulink sensors
  • Watch GPU sensor status: gpulink sensors -w
  • Record the memory usage over time, generate a plot and save it as a png image: gpulink record -o memory.png memory
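The commands above can also be driven from a script. The following is a minimal sketch, not part of gpulink itself: the helper name run_gpulink is hypothetical, and it only uses the --version option shown in the help output. It returns None when the gpulink executable is not on the PATH.

```python
import shutil
import subprocess


def run_gpulink(*args, timeout=10.0):
    """Run the gpulink CLI with the given arguments and return its stdout.

    Hypothetical helper: returns None if the gpulink executable is not on PATH.
    """
    if shutil.which("gpulink") is None:
        return None
    completed = subprocess.run(
        ["gpulink", *args],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str
        timeout=timeout,      # avoid hanging on commands that wait for input
        check=False,          # do not raise on a non-zero exit code
    )
    return completed.stdout


version_output = run_gpulink("--version")
print(version_output if version_output is not None else "gpulink not found on PATH")
```

Commands such as record or sensors -w keep running until they are aborted, so a wrapper like this suits one-shot calls only.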

Library usage

gpulink can be used within applications: import gpulink and create a DeviceCtx. This context manages device access and provides an API for fetching GPU properties (see API example):

import gpulink as gpu

with gpu.DeviceCtx() as ctx:
    print(f"Available GPUs: {ctx.gpus.names}")
    memory_information = ctx.get_memory_info(gpus=ctx.gpus.ids)

Recording data

gpulink provides a Recorder class for recording GPU properties. For simple instantiation use one of the provided factory methods, e.g.:

recorder = gpu.Recorder.create_memory_recorder(ctx, ctx.gpus.ids)

Afterwards a recording can be performed:

Option 1: Using start and stop method (see Basic example)

    recorder.start()
    ... # Do some GPU stuff
    recorder.stop(auto_join=True)

Option 2: Using a context manager (see Context-Manager example)

    with recorder:
        ... # Do some GPU stuff

Option 3: Using a decorator (see Decorator example)

    @record(factory=gpu.Recorder.create_memory_recorder)
    def my_gpu_function():
        ... # Do some GPU stuff

    my_gpu_function()

Once a recording is finished, its data can be accessed:

recording = recorder.get_recording()
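The three recording styles share one start/stop lifecycle. The sketch below illustrates that pattern with a hypothetical stand-in class, ToyRecorder, and a hypothetical decorator, record_with. It is not gpulink's Recorder or record implementation, and it samples a plain Python callable instead of a GPU.

```python
import functools
import threading
import time


class ToyRecorder:
    """Hypothetical stand-in for a recorder; NOT gpulink's implementation."""

    def __init__(self, sample_fn, interval=0.01):
        self._sample_fn = sample_fn   # callable returning one measurement
        self._interval = interval     # seconds between two samples
        self._stop_event = threading.Event()
        self._thread = None
        self.samples = []             # list of (timestamp, value) tuples

    def _run(self):
        # Sample until stop() sets the event; wait() doubles as the sleep.
        while not self._stop_event.is_set():
            self.samples.append((time.monotonic(), self._sample_fn()))
            self._stop_event.wait(self._interval)

    def start(self):
        self._stop_event.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop_event.set()
        self._thread.join()

    # Context-manager style: start on entry, stop on exit (also on errors).
    def __enter__(self):
        self.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.stop()
        return False  # do not swallow exceptions


def record_with(recorder):
    """Decorator style: run the wrapped function inside the recorder context."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with recorder:
                return func(*args, **kwargs)
        return wrapper
    return decorator


recorder = ToyRecorder(sample_fn=lambda: 42)


@record_with(recorder)
def work():
    time.sleep(0.05)  # placeholder for the actual workload
    return "done"


result = work()
print(result, len(recorder.samples) > 0)
```

Building the decorator on the context manager, and the context manager on start and stop, keeps the cleanup logic in one place.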

Plotting data

gpulink provides a Plot class for visualizing recordings using matplotlib:

    from pathlib import Path
    
    # Generate the plot
    plot = gpu.Plot(recording)
    
    # Display the plot
    plot.plot()
    
    # Save the plot as an image
    plot.save(Path("memory.png"))
    
    # The generated Figure and Axis can also be accessed directly
    figure, axis = plot.generate_graph()

Unit testing

When using gpulink inside unit tests, use an existing device mock, e.g. DeviceMock, or create your own. To create a custom mock class, derive it from BaseDevice. Then pass the mock when creating a DeviceCtx:

import gpulink as gpu

with gpu.DeviceCtx(device=DeviceMock) as ctx:
    ...

Troubleshooting

  • If you get the error message below, make sure that the NVIDIA Management Library is installed on your system by typing nvidia-smi --version into a terminal:
    pynvml.nvml.NVMLError_LibraryNotFound: NVML Shared Library Not Found.
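The same check can be run from Python. This is a minimal sketch assuming pynvml is importable in the current environment; the function name diagnose_nvml is hypothetical. It relies only on pynvml's nvmlInit, nvmlShutdown and the NVMLError base class, of which NVMLError_LibraryNotFound is a subclass.

```python
import shutil


def diagnose_nvml():
    """Return a short, human-readable status string for the NVML setup."""
    # nvidia-smi ships with NVML, so its absence is a strong hint.
    hint = "" if shutil.which("nvidia-smi") else "nvidia-smi not found on PATH; "
    try:
        import pynvml  # the NVML wrapper that gpulink uses
    except ImportError:
        return hint + "pynvml is not installed"
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError as error:
        # Covers NVMLError_LibraryNotFound and other initialization failures.
        return hint + f"NVML could not be initialized: {error}"
    pynvml.nvmlShutdown()
    return "NVML is available"


status = diagnose_nvml()
print(status)
```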

Planned features

  • Live-plotting of GPU stats

Changelog

  • 0.4.0
    • Recording arbitrary GPU stats (clock, fan-speed, memory, power-usage, temp)
    • Display GPU name and power usage within sensors command
    • Replaced the argparse library with click
    • A watch or record command can now be aborted by pressing any key instead of Ctrl+C
  • 0.4.1
    • Fix error when calling nvmlDeviceGetName in pynvml version 11.5.0
  • 0.5.0
    • Add context-manager-based recording
    • Add decorator-based recording
  • 0.6.0
    • Remove PlotOptions class
    • Fix imports and update unit tests
