Skip to main content

A simple tool for monitoring and displaying GPU stats

Project description

gpulink

A library and command-line tool for monitoring NVIDIA GPU stats.
gpulink uses pynvml - a Python wrapper for the NVIDIA Management Library (NVML).

Current status

⚠ This project is in a very early state and under heavy development - breaking changes between versions are possible ⚠

Requirements

gpulink requires the NVIDIA Management Library to be installed which is shipped together with nvidia-smi.

Installation

Installation using PIP

To install gpulink using the Python Package Manager (PIP) run:
pip install gpulink

Using from source

gpulink can also be used from source. For this, perform the following steps to create a Python environment and to install the requirements:

  1. Create an environment: python -m venv env
  2. Activate the environment: .\env\Scripts\Activate
  3. Install requirements: pip install -r requirements.txt

Command-line usage

gpulink can either be imported as a library or can be used from the command line:

Usage: GPU-Link: Monitor NVIDIA GPUs [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  record   Record GPU properties.
  sensors  Fetch and print the GPU sensor status.

Examples

  • View GPU sensor status: gpulink sensors
  • Watch GPU sensor status: gpulink sensors -w
  • Record the memory usage over time, generate a plot and save it as a png image: gpulink record -o memory.png memory

Library usage

gpulink can be simply used from within Python. Just import gpulink and create a DeviceCtx. This context manages device access and provides an API for fetching GPU properties (see API example):

import gpulink as gpu

with gpu.DeviceCtx() as ctx:
   print(f"Available GPUs: {ctx.gpus.names}")
   memory_information = ctx.get_memory_info(gpus=ctx.gpus.ids)

Recording data

gpulink provides a Recorder class for recording GPU properties. For simple instantiation use one of the provided factory methods, e.g.:

    recorder = gpu.Recorder.create_memory_recorder(ctx, ctx.gpus.ids)
    recorder.start()
    ... # Do some GPU stuff
    recorder.stop(auto_join=True)

Once a recording is finished its data can be accessed:

recording = recording = recorder.get_recording()

Plotting data

gpulink provides a Plot class for visualizing recordings using matplotlib:

    from pathlib import Path
    
    # Generate the plot
    plot = gpu.Plot(recording)
    
    # Display the plot
    plot.plot()
    
    # Save the plot as an image
    plot.save(Path("memory.png"))
    
    # The generated Figure and Axis can also be accessed directly
    figure, axis = plot.generate_graph()

The plot can be parametrized using the PlotOptions dataclass. An example using custom plot options is given in Basic example

Unit testing

When using gpulink inside unit tests, create or use an already existing device mock, e.g. DeviceMock. Then during creating a DeviceCtx provide the mock as follows:

import gpulink as gpu

with gpu.DeviceCtx(device=DeviceMock) as ctx:
   ...

Troubleshooting

  • If you get the error message below, please ensure that the NVIDIA Management Library is installed on you system by typing nvidia-smi --version into a terminal:
    pynvml.nvml.NVMLError_LibraryNotFound: NVML Shared Library Not Found.

Planned features

  • Live-plotting of GPU stats

Changelog

  • 0.4.0
    • Recording arbitrary GPU stats (clock, fan-speed, memory, power-usage, temp)
    • Display GPU name and power usage within sensors command
    • Replaced arparse library by click
    • Aborting a watch or recording command can be done by pressing any instead of ctrl+c

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gpulink-0.4.0.tar.gz (18.4 kB view hashes)

Uploaded Source

Built Distribution

gpulink-0.4.0-py3-none-any.whl (18.6 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page