DyNAS-T (Dynamic Neural Architecture Search Toolkit) - a SuperNet NAS optimization package
DyNAS-T
DyNAS-T (Dynamic Neural Architecture Search Toolkit) is a super-network neural architecture search (NAS) optimization package designed for efficiently discovering optimal deep neural network (DNN) architectures for a variety of performance objectives such as accuracy, latency, multiply-and-accumulates, and model size.
Background
Neural architecture search, the study of automating the discovery of optimal deep neural network architectures for tasks in domains such as computer vision and natural language processing, has seen rapid growth in the machine learning research community. Evaluating DNN architectures during the search process can be very costly due to the training and validation cycles. To mitigate this training overhead, weight-sharing approaches known as one-shot or super-networks [1] reduce training times from thousands of GPU days to a few. These approaches train a task-specific super-network architecture with a weight-sharing mechanism that allows the sub-networks to be treated as unique, individual architectures. This enables sub-network model extraction and validation without a separate training cycle.
To learn more about super-networks and how to define/train them, please see our super-network tutorial.
Algorithms
Evolutionary algorithms, specifically genetic algorithms, have a history of usage in NAS and continue to gain popularity as a highly efficient way to explore the architecture objective space. DyNAS-T supports a wide range of evolutionary algorithms (EAs) such as NSGA-II [2] by leveraging the pymoo library.
A unique capability of DyNAS-T is Lightweight Iterative NAS (LINAS), which pairs evolutionary algorithms with lightly trained objective predictors in an iterative cycle to accelerate architectural exploration [3]. This technique is ~4x more sample efficient than typical one-shot predictor-based NAS approaches.
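Conceptually, LINAS alternates between a small number of costly validation measurements and a cheap predictor-guided search fitted to those measurements. The toy sketch below is a hypothetical illustration of that loop only, not the DyNAS-T implementation (which uses pymoo evolutionary algorithms and dedicated performance predictors); the search space, predictor, and helper names are all placeholders.

# Hypothetical, simplified LINAS-style loop for illustration only (NOT the DyNAS-T
# implementation); the toy search space, predictor, and helper names are placeholders.
import random

random.seed(0)

def sample_candidates(n, dims=8):
    # Random sub-network encodings; a real search space is categorical per layer.
    return [tuple(random.randint(0, 3) for _ in range(dims)) for _ in range(n)]

def validate(config):
    # Stand-in for a costly full-validation measurement (e.g., top-1 accuracy).
    return sum(config) / (3 * len(config)) + random.gauss(0, 0.01)

def fit_predictor(history):
    # "Lightly trained" objective predictor; here a trivial 1-nearest-neighbour lookup.
    def predict(config):
        nearest = min(history, key=lambda cs: sum((a - b) ** 2 for a, b in zip(cs[0], config)))
        return nearest[1]
    return predict

history = [(c, validate(c)) for c in sample_candidates(10)]  # initial full evaluations
for _ in range(5):
    predictor = fit_predictor(history)
    # Inner loop: cheap search against the predictor (DyNAS-T uses pymoo EAs here).
    pool = sorted(sample_candidates(500), key=predictor, reverse=True)
    # Outer loop: spend expensive validations only on the most promising candidates.
    history += [(c, validate(c)) for c in pool[:10]]

print('best validated config:', max(history, key=lambda cs: cs[1]))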
The following optimization algorithms are supported by DyNAS-T, in both the standard and LINAS search tactics.
| 1 Objective (Single-Objective) | 2 Objectives (Multi-Objective) | 3 Objectives (Many-Objective) |
|---|---|---|
| GA* 'ga' | NSGA-II* 'nsga2' | U-NSGA-III* 'unsga3' |
| CMA-ES 'cmaes' | AGE-MOEA 'age' | CTAEA 'ctaea' |
| | MOEAD 'moead' | |

*Recommended for stability of search results
Super-networks
DyNAS-T includes support for the following super-network frameworks, such as Once-for-All (OFA).
| Super-Network | Model Name | Dataset | Objectives/Measurements Supported |
|---|---|---|---|
| OFA MobileNetV3-w1.0 | ofa_mbv3_d234_e346_k357_w1.0 | ImageNet 1K | accuracy_top1, macs, params, latency |
| OFA MobileNetV3-w1.2 | ofa_mbv3_d234_e346_k357_w1.2 | ImageNet 1K | accuracy_top1, macs, params, latency |
| OFA ResNet50 | ofa_resnet50 | ImageNet 1K | accuracy_top1, macs, params, latency |
| Quantization-aware OFA ResNet50 | inc_quantization_ofa_resnet50 | ImageNet 1K | accuracy_top1, model_size, params, latency |
| OFA ProxylessNAS | ofa_proxyless_d234_e346_k357_w1.3 | ImageNet 1K | accuracy_top1, macs, params, latency |
| TransformerLT | transformer_lt_wmt_en_de | WMT En-De | bleu (BLEU Score), macs, params, latency |
| BERT-SST2 | bert_base_sst2 | SST2 | latency, macs, params, accuracy_sst2 |
| BootstrapNAS | - | - | accuracy_top1, macs, params, latency |
| Vision Transformer | vit_base_imagenet | ImageNet 1K | accuracy_top1, macs, params, latency |
ImageNet: When using any of the OFA super-networks, the ImageNet directory tree should have a separate directory for each of the classes in both the train and val sets. To prepare your ImageNet dataset for use with OFA, you can follow the instructions available here (a quick layout check is sketched below).

WMT En-De: To obtain and prepare the dataset, please follow the instructions available here.

BootstrapNAS: BootstrapNAS is currently only available through the Python interface. To read more about how to use DyNAS-T on a BootstrapNAS search space, please refer to the example notebook.
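As a quick sanity check that a local copy of ImageNet follows this layout, something like the following can be used. This is a hypothetical helper, not part of DyNAS-T, and the dataset path is a placeholder.

import os

# Hypothetical helper (not part of DyNAS-T): verify that each split contains one
# sub-directory per ImageNet class, as expected by the OFA super-networks.
dataset_path = '/datasets/imagenet/ilsvrc12_raw'  # placeholder path
for split in ('train', 'val'):
    split_dir = os.path.join(dataset_path, split)
    classes = [d for d in os.listdir(split_dir) if os.path.isdir(os.path.join(split_dir, d))]
    print(f'{split}: {len(classes)} class directories (expected 1000 for ImageNet 1K)')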
Intel Library Support
DyNAS-T is compatible with Intel software libraries such as Intel® Neural Compressor, which it uses as the quantization backend for quantization-aware search.
Getting Started
To set up DyNAS-T from source code, run pip install -e ., or make a local copy of the dynast subfolder in your local subnetwork repository with the requirements.txt dependencies installed.

You can also install DyNAS-T from PyPI:

pip install dynast

Installing DyNAS-T with pip makes a dynast command available in your CLI.
Running DyNAS-T
The dynast/cli.py template (you can use the dynast command to invoke this script) provides a starting point for running the NAS process. An evaluation is the process of determining the fitness of an architectural candidate: a validation evaluation is the costly process of running the full validation set, while a predictor evaluation uses a pre-trained performance predictor. The main parameters are listed below (a sketch of the equivalent Python interface follows the list).
- supernet - Name of the pre-trained super-network. See the list of supported super-networks above. For a custom super-network, you will have to modify the code, including the dynast_manager.py and supernetwork_registry.py files.
- optimization_metrics - The metrics that the NAS process optimizes for. Note that the number of objectives you specify must be compatible with the chosen search algorithm.
- measurements - In addition to the optimization metrics, you can specify which measurements you would like to take during a full evaluation.
- search_tactic - linas (lightweight iterative NAS, recommended) or evolutionary (good for benchmarking and testing new super-networks).
- search_algo - Determines which evolutionary algorithm to run for the linas low-fidelity inner loop or for the evolutionary search tactic.
- num_evals - Number of evaluations (full validation measurements) to take. For example, if one validation measurement takes 5 minutes, 120 evaluations would take 10 hours.
- seed - Random seed.
- population - The size of the pool of candidates for each evolutionary generation. 50 is recommended for most cases, though this can be treated as a tunable hyperparameter.
- results_path - The location of the CSV file that stores information about the DNN candidates evaluated during the search process. The CSV file is later used for plotting NAS results.
- dataset_path - Location of the dataset used for training the super-network of interest.
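The same parameters can also be passed through the Python interface. The sketch below is a hedged illustration only: it assumes that the DyNAS class in dynast_manager.py accepts keyword arguments mirroring the CLI flags (as suggested by the DyNAS([...], distributed=True) call in the distributed-search note) and exposes a search() entry point; check the package source for the exact signature.

# Hedged sketch of the Python interface; the import path, keyword arguments, and
# search() entry point are assumptions based on dynast_manager.py and the CLI flags.
from dynast.dynast_manager import DyNAS

agent = DyNAS(
    supernet='ofa_mbv3_d234_e346_k357_w1.0',
    optimization_metrics=['accuracy_top1', 'macs'],
    measurements=['accuracy_top1', 'macs', 'params'],
    search_tactic='linas',
    num_evals=250,
    results_path='results.csv',
    dataset_path='/path/to/imagenet',  # placeholder path
    seed=42,
)
results = agent.search()  # assumed entry point; runs the NAS process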
Single-Objective
Example 1a. NAS process for the OFA MobileNetV3-w1.0 super-network that optimizes for ImageNet Top-1 accuracy using a simple evolutionary genetic algorithm (GA) approach.
dynast \
--supernet ofa_mbv3_d234_e346_k357_w1.0 \
--optimization_metrics accuracy_top1 \
--measurements accuracy_top1 macs params \
--results_path mbnv3w10_ga_acc.csv \
--search_tactic evolutionary \
--num_evals 250 \
--search_algo ga
Example 1b. NAS process for the OFA MobileNetV3-w1.2 super-network that optimizes for ImageNet Top-1 accuracy using a LINAS + GA approach.
dynast \
--supernet ofa_mbv3_d234_e346_k357_w1.2 \
--optimization_metrics accuracy_top1 \
--measurements accuracy_top1 macs params \
--results_path mbnv3w12_linasga_acc.csv \
--search_tactic linas \
--num_evals 250 \
--search_algo ga
Multi-Objective
Example 2a. NAS process for the OFA MobileNetV3-w1.0 super-network that optimizes for ImageNet Top-1 accuracy and multiply-and-accumulates (MACs) using a LINAS+NSGA-II approach.
dynast \
--supernet ofa_mbv3_d234_e346_k357_w1.0 \
--optimization_metrics accuracy_top1 macs \
--measurements accuracy_top1 macs params \
--results_path mbnv3w10_linasnsga2_acc_macs.csv \
--search_tactic linas \
--num_evals 250 \
--search_algo nsga2
Example 2b. NAS process for the OFA ResNet50 super-network that optimizes for ImageNet Top-1 accuracy and model size (parameters) using an evolutionary AGE-MOEA approach.
dynast \
--supernet ofa_resnet50 \
--optimization_metrics accuracy_top1 params \
--measurements accuracy_top1 macs params \
--results_path resnet50_age_acc_params.csv \
--search_tactic evolutionary \
--num_evals 500 \
--search_algo age
Many-Objective
Example 3a. NAS process for the OFA ResNet50 super-network that optimizes for ImageNet Top-1 accuracy, model size (parameters), and multiply-and-accumulates (MACs) using an evolutionary U-NSGA-III ('unsga3') approach.
dynast \
--supernet ofa_resnet50 \
--optimization_metrics accuracy_top1 macs params \
--measurements accuracy_top1 macs params \
--results_path resnet50_unsga3_acc_macs_params.csv \
--search_tactic evolutionary \
--num_evals 500 \
--search_algo unsga3
Example 3b. NAS process for the OFA MobileNetV3-w1.0 super-network that optimizes for ImageNet Top-1 accuracy, model size (parameters), and multiply-and-accumulates (MACs) using a LINAS+U-NSGA-III ('unsga3') approach.
dynast \
--supernet ofa_mbv3_d234_e346_k357_w1.0 \
--optimization_metrics accuracy_top1 macs params \
--measurements accuracy_top1 macs params \
--results_path mbnv3w10_linasunsga3_acc_macs_params.csv \
--search_tactic linas \
--num_evals 500 \
--search_algo unsga3
A multi-objective search run with both the LINAS+NSGA-II and standard NSGA-II settings produces results CSVs whose architecture/objective points can be plotted and compared (e.g., the Pareto frontiers found by each tactic).
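A hedged sketch of such a plot is shown below. It assumes the results CSV contains one row per evaluated sub-network with columns named after the requested measurements (e.g., macs and accuracy_top1); check the header of your own CSV for the exact column names.

import pandas as pd
import matplotlib.pyplot as plt

# Assumption: the results CSV has one row per evaluated sub-network and columns
# named after the requested measurements (check the file header for exact names).
df = pd.read_csv('mbnv3w10_linasnsga2_acc_macs.csv')

plt.scatter(df['macs'], df['accuracy_top1'], s=8)
plt.xlabel('MACs')
plt.ylabel('ImageNet Top-1 accuracy')
plt.title('DyNAS-T multi-objective search results')
plt.show()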
Quantization-aware Search
This approach allows you to run a search on your FP32 super-network and find model configurations that are optimal with respect to both the architecture and the post-training quantization policy. DyNAS-T's implementation uses Intel® Neural Compressor as the underlying backend for quantizing models. This search approach is specific to the CPU, so --device cpu has to be used.
Example 4. Quantization-aware search on OFA ResNet50 super-network.
dynast \
--results_path dynast_ofaresnet50_quant.csv \
--dataset_path /ML_datasets/imagenet/ilsvrc12_raw \
--supernet inc_quantization_ofa_resnet50 \
--device cpu \
--batch_size 128 \
--search_tactic linas \
--measurements latency accuracy_top1 \
--optimization_metrics latency accuracy_top1 \
--seed 42
Distributed Search
Search can be performed with multiple workers using MPI / torch.distributed. To use this functionality, your script should be launched with the mpirun/mpiexec command, and the additional --distributed parameter has to be set (or DyNAS([...], distributed=True) when using the Python interface).
Note: When run with torchrun, unless explicitly specified, torch.distributed uses OMP_NUM_THREADS=1 (link), which may result in slow evaluation times. Good practice is to explicitly set OMP_NUM_THREADS to (total_core_count)/(num_workers) (optional for MPI).
Example 5. Distributed NAS process with two OpenMPI workers for the OFA MobileNetV3-w1.0 super-network that optimizes for ImageNet Top-1 accuracy and multiply-and-accumulates (MACs).
OMP_NUM_THREADS=28 mpirun \
--report-bindings \
-x MASTER_ADDR=127.0.0.1 \
-x MASTER_PORT=1234 \
-np 2 \
-bind-to socket \
-map-by socket \
dynast \
--supernet ofa_mbv3_d234_e346_k357_w1.0 \
--optimization_metrics accuracy_top1 macs \
--results_path results.csv \
--search_tactic linas \
--distributed \
--population 50 \
--num_evals 250
References
[1] Cai, H., Gan, C., & Han, S. (2020). Once for All: Train One Network and Specialize it for Efficient Deployment. ArXiv, abs/1908.09791.
[2] K. Deb, A. Pratap, S. Agarwal and T. Meyarivan, "A fast and elitist multiobjective genetic algorithm: NSGA-II," in IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182-197, April 2002, doi: 10.1109/4235.996017.
[3] Cummings, D., Sarah, A., Sridhar, S.N., Szankin, M., Muñoz, J.P., & Sundaresan, S. (2022). A Hardware-Aware Framework for Accelerating Neural Architecture Search Across Modalities. ArXiv, abs/2205.10358.
Legal Disclaimer and Notices
This “research quality code” is for Non-Commercial purposes provided by Intel “As Is” without any express or implied warranty of any kind. Intel does not warrant or assume responsibility for the accuracy or completeness of any information, text, graphics, links or other items within the code. A thorough security review has not been performed on this code. Additionally, this repository may contain components that are out of date or contain known security vulnerabilities. ImageNet, WMT, SST2: Please see the dataset's applicable license for terms and conditions. Intel does not own the rights to these datasets and does not confer any rights to them.