A lightweight web dashboard for gpustat
gpuview
GPUs are an expensive resource, and deep learning practitioners generally have to monitor the
health and usage of their GPUs, such as temperature, memory, utilization, and the users running processes.
This can be done from the terminal with tools like nvidia-smi and gpustat.
However, it is often inconvenient to ssh into servers just to check the status,
especially for a long-running training job that could last from hours to days. gpuview is meant
to serve this exact purpose: it is a lightweight web dashboard that runs on top of
gpustat.
With gpuview one can monitor GPUs on the go through a web browser. What is more, multiple
servers can be registered into one dashboard, and their stats are aggregated and accessible from
one place.
With gpuview you get the latest version of gpustat installed from PyPI, so all the usual commands are directly available from the terminal. See gpustat -h and gpuview -h for all command-line options.
Setup
Install from PyPI:
pip install gpuview
[or] Install directly from repo:
pip install git+https://github.com/fgaim/gpuview.git@master
Usage
Once gpuview is installed, it can be started as follows:
$ gpuview start --safe-zone
This will start the dashboard at http://0.0.0.0:9988.
By default, gpuview listens on IP 0.0.0.0 and port 9988, but these can be changed using --host and --port. The --safe-zone option enables reporting of all details, including user names; it can be omitted for security reasons.
Execute gpuview -h to see runtime options.
- start: Start dashboard server
  - --host: Name or IP address of host (default: 0.0.0.0)
  - --port: Port number to listen to (default: 9988)
  - --safe-zone: Safe to report all details including user names
  - --exclude-self: Don't report to others but to self dashboard
  - -d, --debug: Run server in debug mode (for developers)
- add: Add a GPU host to dashboard
  - --url: URL of host [IP:Port], e.g. X.X.X.X:9988
  - --name: Optional readable name for the host, e.g. Node101
- remove: Remove a registered host from dashboard
  - --url: URL of host to remove, e.g. X.X.X.X:9988
- -v, --version: Print versions of gpuview and gpustat
- -h, --help: Print help for command-line options
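After starting the dashboard with custom --host and --port values, it can be useful to confirm that something is actually listening there. The following is a minimal sketch using only the Python standard library; the function name `is_dashboard_up` and the timeout value are illustrative assumptions and are not part of gpuview.

```python
import socket


def is_dashboard_up(host: str = "127.0.0.1", port: int = 9988, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections at host:port.

    This only checks that the port is open; it does not verify that
    the listener is actually gpuview.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 9988 is the default gpuview port mentioned above.
    print("dashboard reachable:", is_dashboard_up("127.0.0.1", 9988))
```

A check like this fits into a cron job or a startup script, but it is only a connectivity probe and says nothing about the health of the GPUs themselves.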
Run as Service
To run gpuview permanently, it needs to be started as a background service. This can be done using nohup and & as follows:
sudo nohup gpuview start --safe-zone &
A better way of handling this is coming soon...
Monitoring multiple hosts
To aggregate the stats of multiple machines, register them with one dashboard using each machine's address and the port number on which its gpuview instance is listening.
Add a host as follows:
gpuview add --url <ip:port> --name <name>
Remove a registered host as follows:
gpuview remove --url <ip:port>
Note:
gpuview should be run on all hosts in addition to the controller, which itself can be a non-GPU machine.
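When many hosts need to be registered, the add command can be scripted. The following is a minimal sketch, assuming gpuview is on the PATH of the controller; the helper names `build_add_command` and `register_hosts`, the dry-run switch, and the example addresses are hypothetical and only illustrate the `gpuview add --url ... --name ...` form shown above.

```python
import shlex
import subprocess
from typing import Dict, List


def build_add_command(url: str, name: str = "") -> List[str]:
    """Build the argument list for `gpuview add` for a single host."""
    cmd = ["gpuview", "add", "--url", url]
    if name:
        # --name is optional, so it is only appended when provided.
        cmd += ["--name", name]
    return cmd


def register_hosts(hosts: Dict[str, str], dry_run: bool = True) -> List[List[str]]:
    """Register every host; with dry_run=True only print the commands."""
    commands = [build_add_command(url, name) for url, name in hosts.items()]
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Placeholder addresses from the documentation range; replace with real hosts.
    example_hosts = {"192.0.2.11:9988": "Node101", "192.0.2.12:9988": "Node102"}
    register_hosts(example_hosts, dry_run=True)
```

Running with `dry_run=True` first prints the commands so they can be reviewed before anything is registered.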
etc
Helpful tips related to the underlying performance are available at the gpustat repo.
For the sake of simplicity, gpuview does not have a user authentication feature; therefore, as a security measure, it does not report sensitive details such as user and process names by default. However, if the service is being run in a trusted network, all information can be reported using the --safe-zone option of the start command. Similarly, the --exclude-self option can be used to prevent other dashboards from getting the stats of the current machine. This way the stats of the host are only shown on its own dashboard.
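The idea behind the default behaviour can be illustrated with a small sketch. This is not gpuview's actual implementation: the function `redact_stats`, the dictionary layout, and the field names `username` and `command` are assumptions chosen to resemble per-process GPU stats.

```python
import copy
from typing import Any, Dict

# Assumed names of the per-process fields considered sensitive.
SENSITIVE_PROCESS_KEYS = ("username", "command")


def redact_stats(stats: Dict[str, Any], safe_zone: bool = False) -> Dict[str, Any]:
    """Return a copy of `stats`, dropping user and process names unless safe_zone is set."""
    redacted = copy.deepcopy(stats)  # never mutate the caller's data
    if safe_zone:
        return redacted
    for gpu in redacted.get("gpus", []):
        for proc in gpu.get("processes", []):
            for key in SENSITIVE_PROCESS_KEYS:
                proc.pop(key, None)
    return redacted


if __name__ == "__main__":
    sample = {
        "hostname": "Node101",
        "gpus": [
            {
                "index": 0,
                "processes": [
                    {"username": "alice", "command": "python", "gpu_memory_usage": 1024}
                ],
            }
        ],
    }
    print(redact_stats(sample))                  # user and process names removed
    print(redact_stats(sample, safe_zone=True))  # everything reported
```

Non-sensitive values such as memory usage are kept in both cases, so a dashboard could still show utilization without revealing who is running what.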
Thumbnail view of GPUs across multiple hosts.
Detailed view of GPUs across multiple hosts.
License