SuperGradients
Easily train or fine-tune SOTA computer vision models with one open source training library
Version 3 is out! Updated notebooks and tutorials will be added this week. Stay tuned!
Introduction
Welcome to SuperGradients, a free, open-source training library for PyTorch-based deep learning models. SuperGradients allows you to train or fine-tune SOTA pre-trained models for all the most commonly applied computer vision tasks with just one training library. We currently support object detection, image classification and semantic segmentation for videos and images.
Why use SuperGradients?
Built-in SOTA Models
Easily load and fine-tune production-ready, pre-trained SOTA models that incorporate best practices and validated hyper-parameters for achieving best-in-class accuracy.
Easily Reproduce our Results
Why do all the grind work if we already did it for you? Leverage tested and proven recipes and code examples for a wide range of computer vision models, created by our team of deep learning experts. Easily configure your own or use plug & play hyperparameters for training, dataset, and architecture.
Production Readiness and Ease of Integration
All SuperGradients models are production ready in the sense that they are compatible with deployment tools such as TensorRT (Nvidia) and OpenVINO (Intel), so they can be taken into production easily. With a few lines of code you can integrate the models into your codebase.
What's New
- 【06/09/2022】 PP-LiteSeg - new pre-trained checkpoints for Cityscapes with SOTA mIoU scores (~1.5% above paper)🎯
- 【07/08/2022】 DDRNet23 - new pre-trained checkpoints and recipes for Cityscapes with SOTA mIoU scores (~1% above paper)🎯
- 【27/07/2022】 YOLOX models (object detection) - recipes and pre-trained checkpoints.
- 【07/07/2022】 SSD Lite MobileNet V2,V1 - Training recipes and pre-trained checkpoints on COCO - Tailored for edge devices! 📱
- 【07/07/2022】 STDC - new pre-trained checkpoints and recipes for Cityscapes with super SOTA mIoU scores (~2.5% above paper)🎯
- 【16/06/2022】 ResNet50 - new pre-trained checkpoint and recipe for ImageNet top-1 score of 81.9 💪
- 【09/06/2022】 ViT models (Vision Transformer) - Training recipes and pre-trained checkpoints (ViT, BEiT).
- 【09/06/2022】 Knowledge Distillation support - training module and notebook.
- 【06/04/2022】 Integration with professional tools - Weights and Biases and DagsHub.
- 【09/03/2022】 New quick start and transfer learning example notebooks for Semantic Segmentation.
- 【07/02/2022】 We added RegSeg recipes and pre-trained models to our Semantic Segmentation models.
- 【01/02/2022】 We added issue templates for feature requests and bug reporting.
- 【20/01/2022】 STDC family - new recipes added with even higher mIoU💪
Check out SG full release notes.
Coming soon
- PP-LiteSeg recipes for Cityscapes with SOTA mIoU scores (~1.5% above paper)🎯
- Single class detectors (recipes, pre-trained checkpoints) for edge devices deployment.
- Single class segmentation (recipes, pre-trained checkpoints) for edge devices deployment.
- QAT capabilities (Quantization Aware Training).
- Dali implementation.
- Integration with more professional tools.
- Improved pre-trained checkpoints and recipes (DDRNet, ResNet, RegSeg, etc.)
Table of Contents
- Transfer Learning
- Knowledge Distillation Training
- Installation Methods
- Computer Vision Models - Pretrained Checkpoints
- Implemented Model Architectures
- Training Recipes
- Contributing
- Citation
- Community
- License
- Deci Platform
Getting Started
Start Training with Just 1 Command Line
The simplest way to start training SOTA-performance models is to run one of the SuperGradients reproducible recipes from your terminal. Just define your dataset path and where you want your checkpoints to be saved, and you are good to go!
python -m super_gradients.train_from_recipe --config-name=imagenet_regnetY architecture=regnetY800 dataset_interface.data_dir=<YOUR_Imagenet_LOCAL_PATH> ckpt_root_dir=<CHECKPOINT_DIRECTORY>
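If you prefer to launch a recipe from a Python script rather than a shell, the same command can be assembled as an argument list. The sketch below only builds and prints the command. `build_recipe_command` is a hypothetical helper, not part of SuperGradients, and the two paths are placeholders; the override keys are the ones shown in the command above.

```python
import shlex
import sys


def build_recipe_command(config_name, overrides):
    """Assemble the train_from_recipe command as an argument list.

    `overrides` maps Hydra-style keys (e.g. "architecture") to values.
    Illustrative helper only; it is not part of SuperGradients.
    """
    cmd = [
        sys.executable,
        "-m",
        "super_gradients.train_from_recipe",
        f"--config-name={config_name}",
    ]
    # Each override is passed as a single key=value token.
    cmd += [f"{key}={value}" for key, value in overrides.items()]
    return cmd


cmd = build_recipe_command(
    "imagenet_regnetY",
    {
        "architecture": "regnetY800",
        "dataset_interface.data_dir": "/data/imagenet",  # placeholder path
        "ckpt_root_dir": "/data/checkpoints",            # placeholder path
    },
)
# Print a shell-safe rendering of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
```

To actually start training you could pass `cmd` to `subprocess.run(cmd, check=True)`; passing a list avoids shell quoting problems with paths that contain spaces.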
Quickly Load Pre-Trained Weights for Your Desired Model with SOTA Performance
Want to try our pre-trained models on your machine? Import SuperGradients, initialize your Trainer, and load your desired architecture and pre-trained weights from our SOTA model zoo.
# The pretrained_weights argument names the dataset the pre-trained weights were trained on
# This example loads COCO-2017 pre-trained weights for a YOLOX Nano object detection model
import super_gradients
from super_gradients.training import Trainer
# Replace <CHECKPOINT_DIRECTORY> with the directory where checkpoints should be saved
trainer = Trainer(experiment_name="yoloxn_coco_experiment", ckpt_root_dir="<CHECKPOINT_DIRECTORY>")
trainer.build_model(architecture="yolox_n", arch_params={"pretrained_weights": "coco", "num_classes": 80})
Quick Start Notebook - Classification
Get started with our quick start notebook for image classification on Google Colab, which runs on free GPU hardware.
Classification Quick Start in Google Colab | Download notebook | View source on GitHub |
Quick Start Notebook - Semantic Segmentation
Get started with our quick start notebook for semantic segmentation on Google Colab, which runs on free GPU hardware.
Segmentation Quick Start in Google Colab | Download notebook | View source on GitHub |
Transfer Learning
Transfer Learning with SG Notebook - Semantic Segmentation
Learn more about SuperGradients transfer learning and fine-tuning abilities with our example notebook on Google Colab, an easy-to-use tutorial that runs on free GPU hardware. It fine-tunes a Cityscapes pre-trained RegSeg48 on a sub-dataset of Supervisely.
Segmentation Transfer Learning in Google Colab | Download notebook | View source on GitHub |
Knowledge Distillation Training
Knowledge Distillation Training Quick Start with SG Notebook - ResNet18 example
Knowledge Distillation is a training technique that uses a large model, the teacher model, to improve the performance of a smaller model, the student model. Learn more about SuperGradients knowledge distillation training with our example notebook on Google Colab, an easy-to-use tutorial that runs on free GPU hardware. It trains a ResNet18 student model on CIFAR10 with a pre-trained BEiT base teacher model.
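To make the teacher/student idea concrete, here is a minimal sketch of one common distillation loss: a weighted sum of the usual cross-entropy on the hard label and a KL-divergence term between temperature-softened teacher and student distributions. It is written in plain Python for readability, and it is not claimed to be the loss SuperGradients uses; the `temperature` and `alpha` defaults are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of `logits` divided by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, label, temperature=4.0, alpha=0.5):
    """alpha * CE(student, label) + (1 - alpha) * T^2 * KL(teacher || student)."""
    # Hard-label cross-entropy, computed at temperature 1.
    ce = -math.log(softmax(student_logits)[label])
    # Soft targets: both distributions are softened with the same temperature.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(q_teacher, q_student) if t > 0)
    # The T^2 factor keeps the soft-target term on a scale comparable to CE.
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl


student = [1.0, 0.5, 0.2]   # made-up logits for a 3-class problem
teacher = [3.0, 0.5, -1.0]  # made-up logits from a stronger model
print(kd_loss(student, teacher, label=0))
```

A higher temperature flattens both distributions, so the student also learns from the relative scores the teacher gives to the wrong classes.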
KD Training in Google Colab | Download notebook | View source on GitHub |
Installation Methods
Prerequisites
General requirements
- Python 3.7, 3.8 or 3.9 installed.
- torch>=1.9.0
- The Python packages specified in requirements.txt
To train on Nvidia GPUs
- Nvidia CUDA Toolkit >= 11.2
- CuDNN >= 8.1.x
- Nvidia Driver with CUDA >= 11.2 support (≥460.x)
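The general requirements above can be checked with a short script before installing. This is a sketch using only the standard library plus an optional `torch` import; the helper names (`python_is_supported`, `parse_version`, `torch_is_supported`, `report`) are hypothetical and not part of SuperGradients. It does not check the CUDA Toolkit, cuDNN, or driver versions.

```python
import importlib.util
import re
import sys

SUPPORTED_PYTHON = {(3, 7), (3, 8), (3, 9)}  # from the list above
MIN_TORCH = (1, 9, 0)                        # torch>=1.9.0


def python_is_supported(version_info=sys.version_info):
    """True if the major.minor Python version is one of the listed ones."""
    return (version_info[0], version_info[1]) in SUPPORTED_PYTHON


def parse_version(text):
    """Turn a string such as '1.12.1+cu113' into the tuple (1, 12, 1)."""
    core = text.split("+")[0]  # drop a local build suffix such as +cu113
    parts = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def torch_is_supported(torch_version):
    """True if the given torch version string satisfies torch>=1.9.0."""
    return parse_version(torch_version) >= MIN_TORCH


def report():
    """Print a short summary of the local environment."""
    print("Python OK:", python_is_supported())
    if importlib.util.find_spec("torch") is None:
        print("torch is not installed")
        return
    import torch
    print("torch OK:", torch_is_supported(torch.__version__))
    print("CUDA available:", torch.cuda.is_available())


if __name__ == "__main__":
    report()
```

Comparing versions as integer tuples rather than strings matters here: as strings, "1.10.0" would sort before "1.9.0".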
Quick Installation
Install using GitHub
pip install git+https://github.com/Deci-AI/super-gradients.git@stable
Computer Vision Models - Pretrained Checkpoints
Pretrained Classification PyTorch Checkpoints
Model | Dataset | Resolution | Top-1 | Top-5 | Latency (HW)*T4 | Latency (Production)**T4 | Latency (HW)*Jetson Xavier NX | Latency (Production)**Jetson Xavier NX | Latency Cascade Lake |
---|---|---|---|---|---|---|---|---|---|
ViT base | ImageNet21K | 224x224 | 84.15 | - | 4.46ms | 4.60ms | - * | - | 57.22ms |
ViT large | ImageNet21K | 224x224 | 85.64 | - | 12.81ms | 13.19ms | - * | - | 187.22ms |
BEiT | ImageNet21K | 224x224 | - | - | - | - | - * | - | - |
EfficientNet B0 | ImageNet | 224x224 | 77.62 | 93.49 | 0.93ms | 1.38ms | - * | - | 3.44ms |
RegNet Y200 | ImageNet | 224x224 | 70.88 | 89.35 | 0.63ms | 1.08ms | 2.16ms | 2.47ms | 2.06ms |
RegNet Y400 | ImageNet | 224x224 | 74.74 | 91.46 | 0.80ms | 1.25ms | 2.62ms | 2.91ms | 2.87ms |
RegNet Y600 | ImageNet | 224x224 | 76.18 | 92.34 | 0.77ms | 1.22ms | 2.64ms | 2.93ms | 2.39ms |
RegNet Y800 | ImageNet | 224x224 | 77.07 | 93.26 | 0.74ms | 1.19ms | 2.77ms | 3.04ms | 2.81ms |
ResNet 18 | ImageNet | 224x224 | 70.6 | 89.64 | 0.52ms | 0.95ms | 2.01ms | 2.30ms | 4.56ms |
ResNet 34 | ImageNet | 224x224 | 74.13 | 91.7 | 0.92ms | 1.34ms | 3.57ms | 3.87ms | 7.64ms |
ResNet 50 | ImageNet | 224x224 | 81.91 | 93.0 | 1.03ms | 1.44ms | 4.78ms | 5.10ms | 9.25ms |
MobileNet V3_large-150 epochs | ImageNet | 224x224 | 73.79 | 91.54 | 0.67ms | 1.11ms | 2.42ms | 2.71ms | 1.76ms |
MobileNet V3_large-300 epochs | ImageNet | 224x224 | 74.52 | 91.92 | 0.67ms | 1.11ms | 2.42ms | 2.71ms | 1.76ms |
MobileNet V3_small | ImageNet | 224x224 | 67.45 | 87.47 | 0.55ms | 0.96ms | 2.01ms * | 2.35ms | 1.06ms |
MobileNet V2_w1 | ImageNet | 224x224 | 73.08 | 91.1 | 0.46ms | 0.89ms | 1.65ms * | 1.90ms | 1.56ms |
NOTE:
- Latency (HW)* - Hardware performance (not including IO)
- Latency (Production)** - Production Performance (including IO)
- Performance measured for T4 and Jetson Xavier NX with TensorRT, using FP16 precision and batch size 1
- Performance measured for Cascade Lake CPU with OpenVINO, using FP16 precision and batch size 1
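For readers unfamiliar with the Top-1 and Top-5 columns: Top-k accuracy is the percentage of samples whose true label is among the k classes with the highest predicted scores. A minimal plain-Python sketch, using made-up scores for a 3-class problem:

```python
def top_k_accuracy(scores, labels, k=1):
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


scores = [
    [0.10, 0.70, 0.20],  # highest score: class 1
    [0.50, 0.30, 0.20],  # highest score: class 0
    [0.20, 0.30, 0.50],  # highest score: class 2
    [0.40, 0.35, 0.25],  # highest score: class 0
]
labels = [1, 1, 2, 2]

print(top_k_accuracy(scores, labels, k=1))  # 50.0
print(top_k_accuracy(scores, labels, k=2))  # 75.0
```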
Pretrained Object Detection PyTorch Checkpoints
Model | Dataset | Resolution | mAP (val) 0.5:0.95 | Latency (HW)*T4 | Latency (Production)**T4 | Latency (HW)*Jetson Xavier NX | Latency (Production)**Jetson Xavier NX | Latency Cascade Lake |
---|---|---|---|---|---|---|---|---|
SSD lite MobileNet v2 | COCO | 320x320 | 21.5 | 0.77ms | 1.40ms | 5.28ms | 6.44ms | 4.13ms |
SSD lite MobileNet v1 | COCO | 320x320 | 24.3 | 1.55ms | 2.84ms | 8.07ms | 9.14ms | 22.76ms |
YOLOX nano | COCO | 640x640 | 26.77 | 2.47ms | 4.09ms | 11.49ms | 12.97ms | - |
YOLOX tiny | COCO | 640x640 | 37.18 | 3.16ms | 4.61ms | 15.23ms | 19.24ms | - |
YOLOX small | COCO | 640x640 | 40.47 | 3.58ms | 4.94ms | 18.88ms | 22.48ms | - |
YOLOX medium | COCO | 640x640 | 46.4 | 6.40ms | 7.65ms | 39.22ms | 44.5ms | - |
YOLOX large | COCO | 640x640 | 49.25 | 10.07ms | 11.12ms | 68.73ms | 77.01ms | - |
NOTE:
- Latency (HW)* - Hardware performance (not including IO)
- Latency (Production)** - Production Performance (including IO)
- Latency performance measured for T4 and Jetson Xavier NX with TensorRT, using FP16 precision and batch size 1
- Latency performance measured for Cascade Lake CPU with OpenVINO, using FP16 precision and batch size 1
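The "0.5:0.95" in the mAP column refers to the COCO-style convention of averaging AP over ten IoU thresholds, from 0.50 to 0.95 in steps of 0.05. The sketch below shows the underlying box IoU and which thresholds a single prediction would satisfy. It is an illustration of the metric's building block, not a full mAP implementation, and the boxes are made up.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height are clamped at zero for boxes that do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which the prediction counts as a match for the ground truth."""
    iou = box_iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if iou >= t]


pred = (0, 0, 100, 100)
gt = (0, 0, 100, 78)
print(box_iou(pred, gt))             # IoU of the two example boxes
print(matched_thresholds(pred, gt))  # thresholds from 0.5 up to 0.75
```

A detection that is only roughly aligned with the ground truth passes the loose thresholds but fails the strict ones, which is why mAP 0.5:0.95 rewards precise localization.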
Pretrained Semantic Segmentation PyTorch Checkpoints
Model | Dataset | Resolution | mIoU | Latency b1T4 | Latency b1T4 including IO |
---|---|---|---|---|---|
PP-LiteSeg B50 | Cityscapes | 512x1024 | 76.48 | 4.18ms | 31.22ms |
PP-LiteSeg B75 | Cityscapes | 768x1536 | 78.52 | 6.84ms | 33.69ms |
PP-LiteSeg T50 | Cityscapes | 512x1024 | 74.92 | 3.26ms | 30.33ms |
PP-LiteSeg T75 | Cityscapes | 768x1536 | 77.56 | 5.20ms | 32.28ms |
DDRNet 23 slim | Cityscapes | 1024x2048 | 78.01 | 5.74ms | 32.01ms |
DDRNet 23 | Cityscapes | 1024x2048 | 80.26 | 12.74ms | 39.01ms |
STDC 1-Seg50 | Cityscapes | 512x1024 | 75.11 | 3.34ms | 30.12ms |
STDC 1-Seg75 | Cityscapes | 768x1536 | 77.8 | 5.53ms | 32.49ms |
STDC 2-Seg50 | Cityscapes | 512x1024 | 76.44 | 4.12ms | 30.94ms |
STDC 2-Seg75 | Cityscapes | 768x1536 | 78.93 | 6.95ms | 33.89ms |
RegSeg (exp48) | Cityscapes | 1024x2048 | 78.15 | 12.03ms | 38.91ms |
Larger RegSeg (exp53) | Cityscapes | 1024x2048 | 79.2 | 22.00ms | 48.96ms |
NOTE: Performance measured on T4 GPU with TensorRT, using FP16 precision and batch size 1 (latency), and not including IO
NOTE: For resolutions below 1024x2048 we first resize the input to the inference resolution and then resize the predictions to 1024x2048. The time of resizing is included in the measurements so that the practical input-size is 1024x2048.
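The mIoU column is the mean, over classes, of the per-class intersection over union between predicted and ground-truth pixels. A minimal sketch in plain Python, using flat lists of made-up per-pixel labels; a real evaluation accumulates the confusion matrix over the whole validation set and typically skips an "ignore" label, which this sketch does not handle.

```python
def confusion_matrix(pred, target, num_classes):
    """cm[t][p] counts pixels with true class t that were predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Mean of TP / (TP + FP + FN) over classes that appear at least once, in percent."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                  # true class c, predicted as another class
        fp = sum(row[c] for row in cm) - tp   # predicted c, true class is another class
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return 100.0 * sum(ious) / len(ious) if ious else 0.0


target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(pred, target, num_classes=3)
print(mean_iou(cm))  # per-class IoUs are 1/3, 2/3 and 1/2
```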
Implemented Model Architectures
Image Classification
- DenseNet - Densely Connected Convolutional Networks https://arxiv.org/pdf/1608.06993.pdf
- DPN - Dual Path Networks https://arxiv.org/pdf/1707.01629
- EfficientNet - https://arxiv.org/abs/1905.11946
- GoogleNet - https://arxiv.org/pdf/1409.4842
- LeNet - https://yann.lecun.com/exdb/lenet/
- MobileNet - Efficient Convolutional Neural Networks for Mobile Vision Applications https://arxiv.org/pdf/1704.04861
- MobileNet v2 - https://arxiv.org/pdf/1801.04381
- MobileNet v3 - https://arxiv.org/pdf/1905.02244
- PNASNet - Progressive Neural Architecture Search Networks https://arxiv.org/pdf/1712.00559
- Pre-activation ResNet - https://arxiv.org/pdf/1603.05027
- RegNet - https://arxiv.org/pdf/2003.13678.pdf
- RepVGG - Making VGG-style ConvNets Great Again https://arxiv.org/pdf/2101.03697.pdf
- ResNet - Deep Residual Learning for Image Recognition https://arxiv.org/pdf/1512.03385
- ResNeXt - Aggregated Residual Transformations for Deep Neural Networks https://arxiv.org/pdf/1611.05431
- SENet - Squeeze-and-Excitation Networks https://arxiv.org/pdf/1709.01507
- ShuffleNet - https://arxiv.org/pdf/1707.01083
- ShuffleNet v2 - Efficient Convolutional Neural Network for Mobile Devices https://arxiv.org/pdf/1807.11164
- VGG - Very Deep Convolutional Networks for Large-scale Image Recognition https://arxiv.org/pdf/1409.1556
Object Detection
- CSP DarkNet
- DarkNet-53
- SSD (Single Shot Detector) - https://arxiv.org/pdf/1512.02325
- YOLOX - https://arxiv.org/abs/2107.08430
Semantic Segmentation
- PP-LiteSeg - https://arxiv.org/pdf/2204.02681v1.pdf
- DDRNet (Deep Dual-resolution Networks) - https://arxiv.org/pdf/2101.06085.pdf
- LadderNet - Multi-path networks based on U-Net for medical image segmentation https://arxiv.org/pdf/1810.07810
- RegSeg - Rethink Dilated Convolution for Real-time Semantic Segmentation https://arxiv.org/pdf/2111.09957
- ShelfNet - https://arxiv.org/pdf/1811.11254
- STDC - Rethinking BiSeNet For Real-time Semantic Segmentation https://arxiv.org/pdf/2104.13188
Training Recipes
We defined recipes so that anyone can reproduce our results as simply as possible.
Setup
To run recipes you first need to clone the super-gradients repository:
git clone https://github.com/Deci-AI/super-gradients
You then need to move to the root of the cloned project (where you find "requirements.txt" and "setup.py") and install super-gradients:
pip install -e .
Finally, append super-gradients to the Python path (replace "YOUR-LOCAL-PATH" with the path to the downloaded repo):
export PYTHONPATH=$PYTHONPATH:<YOUR-LOCAL-PATH>/super-gradients/
How to run a recipe
The recipes are defined in .yaml format and we use the hydra library to allow you to easily customize the parameters. The basic syntax is as follows:
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=<CONFIG-NAME> dataset_params.data_dir=<PATH-TO-DATASET>
Note: this script needs to be launched from the root folder of super_gradients.
Note: if you stored your dataset in the path specified by the recipe you can drop "dataset_params.data_dir=".
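To see what a command-line override such as "dataset_params.data_dir=<PATH-TO-DATASET>" does conceptually, the sketch below applies dotted key=value overrides to a nested dictionary. It illustrates the idea only and is not Hydra's implementation: Hydra also parses value types and validates keys, whereas here every value stays a string. The `recipe` dictionary and its keys are made-up stand-ins for a recipe .yaml file.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted key=value overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        key, _, value = item.partition("=")
        # In Hydra, a leading "+" adds a key that the config does not define yet.
        key = key.lstrip("+")
        *parents, leaf = key.split(".")
        node = result
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return result


# Hypothetical recipe contents, for illustration only.
recipe = {
    "architecture": "yolox_n",
    "dataset_params": {"data_dir": "/data/coco", "batch_size": "16"},
}

updated = apply_overrides(
    recipe,
    [
        "architecture=yolox_s",
        "dataset_params.data_dir=/home/coco2017",
        "+experiment_name=demo",
    ],
)
print(updated)
```

Only the keys named on the command line change; everything else keeps the value from the recipe, which is why an override can be dropped when the recipe's default already matches your setup.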
Explore our recipes
You can find all of our recipes here. You will find information about the performance of a recipe as well as the command to execute it in the header of its config file.
Example: Training YOLOX Small on COCO 2017, using 8 GPUs
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=coco2017_yolox architecture=yolox_s dataset_params.data_dir=/home/coco2017
List of commands
All the commands to launch the recipes described here are listed below. Please make sure to set "dataset_params.data_dir=" if you did not store the dataset in the path specified by the recipe (as shown in the example above).
- Classification
Cifar10
resnet
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=cifar10_resnet +experiment_name=cifar10
ImageNet
efficientnet
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=imagenet_efficientnet
mobilenetv2
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=imagenet_mobilenetv2
mobilenetv3 small
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=imagenet_mobilenetv3_small
mobilenetv3 large
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=imagenet_mobilenetv3_large
regnetY200
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=imagenet_regnetY architecture=regnetY200
regnetY400
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=imagenet_regnetY architecture=regnetY400
regnetY600
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=imagenet_regnetY architecture=regnetY600
regnetY800
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=imagenet_regnetY architecture=regnetY800
repvgg
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=imagenet_repvgg
resnet50
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=imagenet_resnet50
resnet50_kd
python src/super_gradients/examples/train_from_kd_recipe_example/train_from_kd_recipe.py --config-name=imagenet_resnet50_kd
vit_base
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=imagenet_vit_base
vit_large
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=imagenet_vit_large
- Detection
Coco2017
ssd_lite_mobilenet_v2
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=coco2017_ssd_lite_mobilenet_v2
yolox_n
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=coco2017_yolox architecture=yolox_n
yolox_t
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=coco2017_yolox architecture=yolox_t
yolox_s
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=coco2017_yolox architecture=yolox_s
yolox_m
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=coco2017_yolox architecture=yolox_m
yolox_l
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=coco2017_yolox architecture=yolox_l
yolox_x
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=coco2017_yolox architecture=yolox_x
- Segmentation
Cityscapes
DDRNet23
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=cityscapes_ddrnet
DDRNet23-Slim
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=cityscapes_ddrnet architecture=ddrnet_23_slim
RegSeg48
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=cityscapes_regseg48
STDC1-Seg50
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=cityscapes_stdc_seg50
STDC2-Seg50
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=cityscapes_stdc_seg50 architecture=stdc2_seg
STDC1-Seg75
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=cityscapes_stdc_seg75
STDC2-Seg75
python src/super_gradients/examples/train_from_recipe_example/train_from_recipe.py --config-name=cityscapes_stdc_seg75 external_checkpoint_path=<stdc2-backbone-pretrained-path> architecture=stdc2_seg
Documentation
Check SuperGradients Docs for full documentation, user guide, and examples.
Contributing
To learn about making a contribution to SuperGradients, please see our Contribution page.
Citation
If you are using the SuperGradients library or benchmarks in your research, please cite the SuperGradients deep learning training library.
Community
If you want to be a part of the growing SuperGradients community, hear about all the exciting news and updates, get help, request advanced features, or file a bug or issue report, we would love to welcome you aboard!
- Slack is the place to be and ask questions about SuperGradients and get support. Click here to join our Slack.
- To report a bug, file an issue on GitHub.
- Join the SG Newsletter for staying up to date with new features and models, important announcements, and upcoming events.
- For a short meeting with us, use this link and choose your preferred time.
License
This project is released under the Apache 2.0 license.
Deci Platform
Deci Platform is our end-to-end platform for building, optimizing and deploying deep learning models to production.
Sign up for our FREE Community Tier to enjoy immediate improvement in throughput, latency, memory footprint and model size.
Features:
- Automatically compile and quantize your models with just a few clicks (TensorRT, OpenVINO).
- Gain up to 10X improvement in throughput, latency, memory and model size.
- Easily benchmark your models’ performance on different hardware and batch sizes.
- Invite co-workers to collaborate on models and communicate your progress.
- Deci supports all common frameworks and hardware, from Intel CPUs to Nvidia GPUs and Jetsons.
Sign up for Deci Platform for free here