Multi-Instance-GPU profiling tool
MIG Profiler
MIGProfiler is a toolkit for benchmarking NVIDIA Multi-Instance GPU (MIG) technology. It profiles multiple deep learning training and inference tasks on MIG GPUs.
MIGProfiler offers:
- 🎨 Support for many deep learning tasks and open-source models across a variety of benchmark types
- 📈 Comprehensive benchmark results
- 🐣 Easy setup with a configuration file (WIP)
The project is under rapid development! Please check our benchmark website and join us!
Benchmark Website 📈
Coming soon!
Install 📦️
Manual install
Requirements:
- PyTorch with CUDA
- OpenCV
- Sanic
- Transformers
- Tqdm
- Prometheus client
# create virtual environment
conda create -n mig-perf python=3.8
conda activate mig-perf
# install required packages
conda install pytorch torchvision pytorch-cuda=11.6 -c pytorch -c nvidia
conda install -c conda-forge opencv
pip install transformers
pip install sanic tqdm prometheus_client
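After installing, it can help to confirm that every requirement is importable before starting a benchmark. The following is a minimal sketch, not part of MIGProfiler: the helper name `missing_modules` is hypothetical, and the import names (for example `cv2` for OpenCV) are the usual ones for these packages.

```python
import importlib.util

# Import names for the requirements listed above (OpenCV is imported as cv2).
REQUIRED_MODULES = [
    "torch",
    "torchvision",
    "cv2",
    "sanic",
    "transformers",
    "tqdm",
    "prometheus_client",
]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that the packages are present. To confirm that PyTorch was built with CUDA support, `torch.cuda.is_available()` can be called in the same environment.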
PyPI install
WIP
Use Docker
WIP
Quick Start 🚚
Profiling on a MIG GPU is straightforward. Below are some common deep learning tasks to try.
1. MIG training benchmark
We first create a 1g.10gb MIG device
# enable MIG
sudo nvidia-smi -i 0 -mig 1
# create MIG instance
sudo nvidia-smi mig -cgi 1g.10gb -C
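To check that the MIG device was created, the output of `nvidia-smi -L` can be inspected. The sketch below parses that listing in Python; it is an illustration rather than part of MIGProfiler. The function names are hypothetical, and the line format matched by the regular expression (and shown in `SAMPLE`, which uses placeholder UUIDs) is an assumption that may differ between driver versions.

```python
import re
import subprocess

# Assumed format of a MIG line in `nvidia-smi -L` output, e.g.
#   MIG 1g.10gb     Device  0: (UUID: MIG-...)
MIG_LINE = re.compile(
    r"MIG\s+(?P<profile>\S+)\s+Device\s+(?P<index>\d+):\s+\(UUID:\s+(?P<uuid>[^)]+)\)"
)


def parse_mig_devices(listing):
    """Extract MIG devices (profile, index, UUID) from `nvidia-smi -L` text."""
    devices = []
    for line in listing.splitlines():
        match = MIG_LINE.search(line)
        if match:
            devices.append(
                {
                    "profile": match.group("profile"),
                    "index": int(match.group("index")),
                    "uuid": match.group("uuid"),
                }
            )
    return devices


def list_mig_devices():
    """Run `nvidia-smi -L` and return the parsed MIG devices."""
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
    )
    return parse_mig_devices(result.stdout)


# Illustrative listing with placeholder UUIDs, not real output.
SAMPLE = """GPU 0: NVIDIA A100-SXM4-80GB (UUID: GPU-00000000-0000-0000-0000-000000000000)
  MIG 1g.10gb     Device  0: (UUID: MIG-11111111-1111-1111-1111-111111111111)
"""
```

Calling `parse_mig_devices(SAMPLE)` yields one entry with profile `1g.10gb` and index `0`. On a machine with MIG enabled, `list_mig_devices()` runs the real command.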
Start DCGM metric exporter
docker run -d --rm --gpus all --net mig_perf -p 9400:9400 \
-v "${PWD}/mig_perf/profiler/client/dcp-metrics-included.csv:/etc/dcgm-exporter/customized.csv" \
--name dcgm_exporter --cap-add SYS_ADMIN nvcr.io/nvidia/k8s/dcgm-exporter:2.4.7-2.6.11-ubuntu20.04 \
-c 500 -f /etc/dcgm-exporter/customized.csv -d f
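Once the exporter container is running, its metrics should be readable over HTTP on the published port 9400. The sketch below fetches and parses them using only the standard library. It is a simplified illustration: the function names are hypothetical, the parser assumes lines of the form `name{labels} value` with no trailing timestamp, and the metric in `SAMPLE` is made-up example data.

```python
import re
import urllib.request

# Matches key="value" pairs inside the {...} label section of a metric line.
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text-format lines into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated field (assumes no timestamp).
        head, _, value = line.rpartition(" ")
        name, _, label_part = head.partition("{")
        labels = dict(LABEL.findall(label_part))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url="http://localhost:9400/metrics", timeout=5):
    """Download the exporter's metrics page and parse it."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))


# Illustrative exporter output, not real measurements.
SAMPLE = """# HELP DCGM_FI_DEV_GPU_UTIL GPU utilization (in %).
# TYPE DCGM_FI_DEV_GPU_UTIL gauge
DCGM_FI_DEV_GPU_UTIL{gpu="0",modelName="NVIDIA A100-SXM4-80GB"} 37
"""
```

For anything beyond a quick check, `prometheus_client` (already in the requirements) ships a full parser for this format.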
Start profiling
cd mig_perf/profiler
export PYTHONPATH=$PWD
python train/train_cv.py --bs=32 --model=resnet50 --num_batches=500 --mig-device-id=0
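To benchmark several configurations in a row, the command above can be generated for each combination of model and batch size. The sketch below is one hypothetical way to do that from `mig_perf/profiler`; it uses only the flags shown in the command above, and the helper names and the example batch sizes are assumptions rather than part of MIGProfiler.

```python
import itertools
import subprocess


def build_commands(models, batch_sizes, num_batches=500, mig_device_id=0):
    """Build one train_cv.py command per (model, batch size) combination."""
    commands = []
    for model, bs in itertools.product(models, batch_sizes):
        commands.append(
            [
                "python",
                "train/train_cv.py",
                f"--bs={bs}",
                f"--model={model}",
                f"--num_batches={num_batches}",
                f"--mig-device-id={mig_device_id}",
            ]
        )
    return commands


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: only prints the commands that would be executed.
    run_all(build_commands(["resnet50"], [16, 32, 64]))
```

The dry-run default makes it easy to review the sweep before committing GPU time; pass `dry_run=False` to launch the runs.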
Remember to disable MIG after the benchmark finishes
# destroy compute instances, then GPU instances
sudo nvidia-smi mig -i 0 -dci
sudo nvidia-smi mig -i 0 -dgi
# disable MIG
sudo nvidia-smi -i 0 -mig 0
2. MIG inference benchmark
Start DCGM metric exporter
docker run -d --rm --gpus all --net mig_perf -p 9400:9400 \
-v "${PWD}/mig_perf/profiler/client/dcp-metrics-included.csv:/etc/dcgm-exporter/customized.csv" \
--name dcgm_exporter --cap-add SYS_ADMIN nvcr.io/nvidia/k8s/dcgm-exporter:2.4.7-2.6.11-ubuntu20.04 \
-c 500 -f /etc/dcgm-exporter/customized.csv -d f
Start profiling
cd mig_perf/profiler
export PYTHONPATH=$PWD
python client/block_inference_cv.py --bs=32 --model=resnet50 --num_batches=500 --mig-device-id=0
See more benchmark experiments in ./exp.
3. Visualize
- in notebook
- in Prometheus (under improvement)
Cite Us 🌱
@article{zhang2022migperf,
title={MIGPerf: A Comprehensive Benchmark for Deep Learning Training and Inference Workloads on Multi-Instance GPUs},
author={Zhang, Huaizheng and Li, Yuanming and Xiao, Wencong and Huang, Yizheng and Di, Xing and Yin, Jianxiong and See, Simon and Luo, Yong and Lau, Chiew Tong and You, Yang},
journal={arXiv preprint arXiv:2301.00407},
year={2023}
}
Contributors 👥
- Yuanming Li
- Huaizheng Zhang
- Yizheng Huang
- Xing Di
Acknowledgement
Special thanks to Aliyun and NVIDIA AI Tech Center for providing the MIG GPU server used for benchmarking.
License
This repository is open-sourced under the MIT License.