A Python Package for Automatically Monitoring & Occupying NVIDIA GPUs
GPU4U locates all GPUs on the computer, determines their availability, and returns an ordered list of available GPUs. Availability is based on the current memory consumption and load of each GPU. The package was written with GPU selection for deep learning in mind, but it is not task- or library-specific and can be applied to any task where it is useful to identify available GPUs.
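As an illustration of how availability could be derived from load and memory consumption, here is a minimal sketch. The names GpuStat and available_gpus, the 10% thresholds, and the sort order are assumptions made for this example; they are not taken from GPU4U's source and may differ from what the package actually does.

```python
from typing import List, NamedTuple


class GpuStat(NamedTuple):
    """One GPU reading (hypothetical structure, not part of GPU4U)."""
    index: int           # GPU index as reported by the driver
    load: float          # utilisation as a fraction in [0, 1]
    memory_used: float   # fraction of memory in use, in [0, 1]


def available_gpus(stats: List[GpuStat],
                   max_load: float = 0.1,
                   max_memory: float = 0.1) -> List[int]:
    """Return indices of GPUs under both thresholds, least memory used first."""
    free = [s for s in stats
            if s.load <= max_load and s.memory_used <= max_memory]
    # Order by memory use, then load, then index so the result is deterministic.
    free.sort(key=lambda s: (s.memory_used, s.load, s.index))
    return [s.index for s in free]


# Hypothetical readings for a four-GPU machine.
readings = [
    GpuStat(0, load=0.95, memory_used=0.80),  # busy
    GpuStat(1, load=0.00, memory_used=0.05),  # idle
    GpuStat(2, load=0.05, memory_used=0.01),  # idle
    GpuStat(3, load=0.50, memory_used=0.02),  # low memory use but loaded
]
print(available_gpus(readings))  # [2, 1]
```

Requiring both thresholds at once keeps a GPU that is computing but holds little memory (GPU 3 above) out of the list.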
Requirements
An NVIDIA GPU with the latest NVIDIA driver installed. GPU4U uses the program nvidia-smi to get the status of all NVIDIA GPUs. nvidia-smi should be installed automatically when you install your NVIDIA driver.
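For readers who want to see what a status query looks like, the sketch below calls nvidia-smi with its CSV query options and parses the result. This is one possible approach, not necessarily how GPU4U talks to the driver (the package also lists pynvml as a dependency). The function names parse_gpu_csv and query_gpus and the sample text are hypothetical.

```python
import subprocess

# nvidia-smi's CSV query mode; "nounits" drops the "%" and "MiB" suffixes.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Parse 'index, util %, used MiB, total MiB' rows into dictionaries."""
    gpus = []
    for line in text.strip().splitlines():
        index, util, used, total = [field.strip() for field in line.split(",")]
        gpus.append({
            "index": int(index),
            "load": float(util) / 100.0,                # fraction in [0, 1]
            "memory_used": float(used) / float(total),  # fraction in [0, 1]
        })
    return gpus


def query_gpus():
    """Run nvidia-smi and return one dictionary per GPU (needs an NVIDIA driver)."""
    result = subprocess.run(QUERY, stdout=subprocess.PIPE, check=True,
                            universal_newlines=True)
    return parse_gpu_csv(result.stdout)


# Made-up sample text in the same layout, so the parser can be tried without a GPU.
sample = "0, 97, 10240, 11178\n1, 0, 3, 11178\n"
for gpu in parse_gpu_csv(sample):
    print(gpu)
```

The parser assumes every field is numeric; values that the driver cannot report are not handled in this sketch.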
Python libraries:
- os
- random
- re
- sys
- time
- datetime
- pynvml
Tested on CUDA driver version 450.102.04 with Python 3.6.10.
Installation
With PIP
pip install gpu4u
With Source Code
git clone https://github.com/imrdong/gpu4u.git
cd gpu4u
python setup.py install
Usage
To combine GPU4U with your Python code, all you have to do is:
- Open a terminal in a folder other than the GPU4U folder
- Start a Python console by typing python in the terminal
- In the newly opened Python console, type:
>>> from gpu4u import auto_monitor
>>> auto_monitor(script="the_script_you_want_to_run")
The output of GPU4U depends on the number of GPUs and their current usage; see Demo for more details.
Demo
Script Running with Available GPUs
# Find available GPUs
Find Available GPU: 0, 1, 2, 3. Start Running Your Script.
# Randomly select one GPU to run your script
Script: CUDA_VISIBLE_DEVICES=1 python train.py --batch_size 64
# The start time of the script run
Started @: 2021-02-22 13:08:23
Script Running with No Available GPUs
# No GPUs available; start automatic monitoring with a waiting-time prompt
No Available GPU for Now, Automatic Monitoring for 0:23:10
# Find available GPUs
Find Available GPU: 2, 3. Start Running Your Script.
# Randomly select one GPU to run your script
Script: CUDA_VISIBLE_DEVICES=3 python train.py --batch_size 64
# The start time of the script run
Started @: 2021-02-22 13:31:33
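The behaviour shown in the two demos (wait until a GPU is free, pick one at random, then launch the script with CUDA_VISIBLE_DEVICES set) can be sketched roughly as follows. This is an illustrative reconstruction from the demo output only; wait_for_gpu, build_command, the probe callback, and the polling interval are hypothetical and do not reflect auto_monitor's real signature or internals.

```python
import random
import time
from datetime import datetime, timedelta


def wait_for_gpu(probe, interval=10.0, sleep=time.sleep, rng=random):
    """Poll probe() until it returns a non-empty list of GPU indices,
    then return one of them chosen at random."""
    started = time.monotonic()
    while True:
        free = probe()
        if free:
            print("Find Available GPU: {}.".format(", ".join(map(str, free))))
            return rng.choice(free)
        waited = timedelta(seconds=int(time.monotonic() - started))
        print("No Available GPU for Now, Automatic Monitoring for", waited)
        sleep(interval)


def build_command(script, gpu):
    """Prefix the script with CUDA_VISIBLE_DEVICES so it sees only one GPU."""
    return "CUDA_VISIBLE_DEVICES={} {}".format(gpu, script)


# Simulated probe: no GPU is free for two polls, then GPUs 2 and 3 free up.
answers = iter([[], [], [2, 3]])
gpu = wait_for_gpu(lambda: next(answers), sleep=lambda seconds: None)

command = build_command("python train.py --batch_size 64", gpu)
print("Script:", command)
print("Started @:", datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
# A real launcher would now hand `command` to a shell; this sketch only prints it.
```

Passing the probe and sleep functions as parameters keeps the loop testable without a GPU; in practice the probe would wrap a real status query.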
License
See LICENSE