Lorien: A Hyper-Automated Tuning System for Tensor Operators
Lorien is a system built on top of TVM to massively explore and benchmark the best schedule configs for TOPI schedules.
Motivation
Although TVM already has TOPI (TVM Operator Inventory), which provides algorithm and schedule implementations for commonly used operators such as conv2d and dense, there is a challenge that makes TOPI hard to improve efficiently.
The best schedules for TOPI are stored in TopHub, a JSON file hosted on GitHub. However, this approach has the following problems.
- Storing all schedules in a single text file has poor accessibility and scalability. AutoTVM has to load the entire JSON file every time just to find a single schedule config for a workload.
- The coverage of workloads and platforms is insufficient in the current version. For example, the latest TopHub covers only 690 workloads for the CUDA backend, including conv2d and depthwise conv2d, and only 5 GPU models.
- Compared to TVM, which receives several commits every day, TopHub is updated infrequently. As a result, some schedule configs are out of date and can no longer achieve good performance.
Since it is impractical to use TVM CI to benchmark the performance of every pull request, we need a separate system that regularly benchmarks and updates the stored schedule configs.
Command-Line Interface and Example Usages
The system has a complete CLI with hierarchical commands. All command options can also be specified in a YAML config file and expanded on the command line with the "@" prefix. See the following examples for CLI usage, and configs/samples for example configurations.
Note that the complete description of each command can be retrieved with the help option:
python3 -m lorien <commands> -h
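As a minimal sketch of the "@" expansion, the first extraction example below could also be driven by a config file. The file name and the assumption that the YAML keys mirror the CLI option names (as in the tune.yaml example later in this section) are illustrative and not taken from the Lorien documentation:
# extract_gcv.yaml (hypothetical)
model: alexnet
target: llvm
python3 -m lorien generate extract gcv @extract_gcv.yaml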
- Extract workloads from a Gluon CV model.
python3 -m lorien generate extract gcv --model alexnet --target llvm
- Extract workloads from a TF model.
python3 -m lorien generate extract tf --model ./mobilenet.pb --target llvm
- Extract workloads from a Gluon CV model and mutate them to generate new workloads.
python3 -m lorien generate mutate modelzoo rules.yaml --model alexnet --target llvm
- Tune workloads with RPC servers.
# tune.yaml
rpc:
  llvm -mcpu=skylake-avx512:
    - localhost:18871
db:
  endpoint_url: http://localhost:10020
log-s3-bucket: saved-tuning-logs
ntrial: 3000
python3 -m lorien tune @tune.yaml @gcv_workloads_llvm.yaml
System Requirements
- Python 3.6+
- Amazon DynamoDB (local or AWS): DynamoDB is used for storing and maintaining the tuned schedules. You can either 1) launch a local instance and specify its endpoint URL (e.g., --db "endpoint_url: http://<your IP>:8000"), or 2) launch the AWS service, configure the AWS CLI on your machine, and specify the region name (e.g., --db "region_name: us-west-1") when invoking the tuning. A local setup is sketched after this list.
- AWS S3 (optional): S3 is used to store the full tuning logs (the JSON files generated by AutoTVM). This is optional: if you do not specify --log-s3-bucket bucket_name, the full tuning logs will not be uploaded; only the best schedule config will be submitted to DynamoDB.
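The following is a rough sketch of the local DynamoDB setup mentioned above. It uses AWS's standard DynamoDB Local jar; the port and file paths are illustrative assumptions, and only the --db option itself comes from this document:
# Start DynamoDB Local on port 8000 (requires the DynamoDBLocal jar from AWS)
java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -sharedDb -port 8000
# Point Lorien at the local endpoint when tuning
python3 -m lorien tune @tune.yaml --db "endpoint_url: http://localhost:8000"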
Documentation
TBA