A lightweight web dashboard for monitoring GPU usage
gpuview
GPUs are an expensive resource, and deep learning practitioners have to monitor the
health and usage of their GPUs, such as the temperature, memory, utilization, and the users.
This can be done with tools like nvidia-smi and gpustat from the terminal or command-line.
Often, however, it is not convenient to ssh into servers just to check the GPU status.
gpuview is meant to mitigate this by running a lightweight web dashboard on top of gpustat.
With gpuview one can monitor GPUs on the go, through a web browser. Moreover, multiple GPU servers
can be registered into one gpuview dashboard and all stats are aggregated and accessible from one place.
Thumbnail view of GPUs across multiple servers.
Setup
Python is required; gpuview has been tested with both Python 2.7 and Python 3.
Install from PyPI:
$ pip install gpuview
Or install directly from the repo:
$ pip install git+https://github.com/fgaim/gpuview.git@master
gpuview installs the latest version of gpustat from PyPI; therefore, its commands are available from the terminal.
Usage
Once gpuview is installed, it can be started as follows:
$ gpuview run --safe-zone
This will start the dashboard at http://0.0.0.0:9988.
By default, gpuview listens on IP 0.0.0.0 and port 9988, but these can be changed using --host and --port. The --safe-zone option enables reporting of all details, including user names; it can be left out for security reasons.
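After starting the server, a plain TCP check is one quick way to confirm that the dashboard port is accepting connections. The sketch below is not part of gpuview: the function name dashboard_reachable is made up for illustration, it assumes Python 3, and it only shows that something is listening on the port, not that the listener is gpuview. Since 0.0.0.0 is a bind address, the check connects through the loopback address or the server's real address instead.

```python
import socket


def dashboard_reachable(host, port=9988, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the context manager closes it again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Connect via loopback when checking from the same machine.
    print(dashboard_reachable("127.0.0.1", 9988))
```

If the server was started with a different --host or --port, pass those values to the function instead of the defaults.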
Execute gpuview -h to see runtime options.
run : Start gpuview dashboard server
  --host : URL or IP address of host (default: 0.0.0.0)
  --port : Port number to listen to (default: 9988)
  --safe-zone : Safe to report all details, e.g. usernames
  --exclude-self : Don't report to others but to self-dashboard
  -d, --debug : Run server in debug mode (for developers)
add : Add a GPU host to dashboard
  --url : URL of host [IP:Port], e.g. X.X.X.X:9988
  --name : Optional readable name for the host, e.g. Node101
remove : Remove a registered host from dashboard
  --url : URL of host to remove, e.g. X.X.X.X:9988
hosts : Print out all registered hosts
service : Install gpuview as system service
  --host : URL or IP address of host (default: 0.0.0.0)
  --port : Port number to listen to (default: 9988)
  --safe-zone : Safe to report all details, e.g. usernames
  --exclude-self : Don't report to others but to self-dashboard
-v, --version : Print versions of gpuview and gpustat
-h, --help : Print help for command-line options
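When gpuview is launched from a script rather than typed by hand, it can help to assemble the run command from the options listed above. The following is a minimal sketch under that assumption; build_run_command is a hypothetical helper, not something gpuview provides, and it uses only the flags documented in this section.

```python
def build_run_command(host="0.0.0.0", port=9988, safe_zone=False,
                      exclude_self=False, debug=False):
    """Assemble the argument list for `gpuview run` from its documented options."""
    cmd = ["gpuview", "run", "--host", host, "--port", str(port)]
    if safe_zone:
        cmd.append("--safe-zone")      # report all details, e.g. usernames
    if exclude_self:
        cmd.append("--exclude-self")   # show stats only on the host's own dashboard
    if debug:
        cmd.append("--debug")          # debug mode, for developers
    return cmd


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(" ".join(build_run_command(host="127.0.0.1", port=8080, safe_zone=True)))
    # gpuview run --host 127.0.0.1 --port 8080 --safe-zone
```

The returned list can be handed to subprocess.run(cmd, check=True) on a machine where gpuview is installed.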
Install as a Service
To run gpuview permanently, it needs to be started as a background service.
This process requires sudo privileges.
Run the following command only once to deploy as a service:
$ gpuview service [--safe-zone] [--exclude-self]
If successful, the gpuview service is started immediately and will also autostart at boot time. It can be controlled using supervisorctl start|stop|restart gpuview.
Monitoring multiple hosts
To aggregate the stats of multiple machines, they can be registered to one dashboard using their address and the port number on which gpuview is running.
Add a host as follows:
$ gpuview add --url <ip:port> --name <name>
Remove a registered host as follows:
$ gpuview remove --url <ip:port>
Note: the gpuview service should be started on all hosts that need to be monitored.
Tip: gpuview can be set up on a non-GPU machine (for example, a laptop) to monitor remote servers.
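With more than a few servers, the add command can be repeated from a small script. The sketch below assumes Python 3 and that gpuview is on the PATH. The helper names and the example addresses are hypothetical, and it defaults to a dry run that only prints the commands.

```python
import subprocess


def add_host_commands(hosts):
    """Yield one `gpuview add` argument list per (url, name) pair."""
    for url, name in hosts:
        cmd = ["gpuview", "add", "--url", url]
        if name:  # --name is optional
            cmd += ["--name", name]
        yield cmd


def register_hosts(hosts, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in add_host_commands(hosts):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical addresses and names, for illustration only.
    example_hosts = [("10.0.0.11:9988", "Node101"), ("10.0.0.12:9988", None)]
    register_hosts(example_hosts, dry_run=True)
```

Calling register_hosts(example_hosts, dry_run=False) runs the commands for real; the result can then be checked with gpuview hosts.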
etc
Helpful tips related to the underlying performance are available at the gpustat repo.
For the sake of simplicity, gpuview does not have user authentication in place. As a security measure,
it does not report sensitive details such as user names by default. This can be changed if the service is
running in a trusted network, using the --safe-zone option to report all details.
The --exclude-self option of the run command can be used to prevent other dashboards from getting stats of the current machine. This way the stats are shown only on the host's own dashboard.
Detailed view of GPUs across multiple servers.
License