Lorien: A Hyper-Automated Tuning System for Tensor Operators

Lorien is a system built on top of TVM to explore and benchmark, at scale, the best schedule configs for TOPI schedules.

Motivation

Although TVM already has TOPI (TVM Operator Inventory), which implements the algorithms and schedules for commonly used operators such as conv2d and dense, one challenge makes TOPI hard to improve efficiently.

The best schedules for TOPI are stored in TopHub, which is a JSON file on GitHub. However, this approach has the following problems.

  1. Storing all schedules in a single text file limits accessibility and scalability. AutoTVM has to load the entire JSON file every time it needs to find a single schedule config for a workload.

  2. The coverage of workloads and platforms is insufficient in the current version. For example, the latest TopHub covers only 690 workloads for the CUDA backend, including conv2d and depthwise conv2d, on 5 GPU models.

  3. Compared to TVM, which receives several commits every day, TopHub is not updated frequently. As a result, some schedule configs are out of date and can no longer achieve good performance.

Since it is impractical to use TVM CI to benchmark the performance for every pull request, we need a separate system to regularly benchmark and update the stored schedule configs.
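
To make the first problem above concrete, the following sketch contrasts scanning a whole log for one workload with a keyed lookup of the kind a database provides. The record layout (workload, target, config, latency_ms) and the function names are hypothetical stand-ins for illustration; they are not the TopHub file format or Lorien's schema.

```python
import json

# Hypothetical records, one JSON object per line, standing in for a
# single large file of tuned schedule configs.
LOG_TEXT = "\n".join(
    json.dumps(rec)
    for rec in [
        {"workload": "conv2d_a", "target": "cuda", "config": {"tile": 8}, "latency_ms": 1.9},
        {"workload": "dense_b", "target": "cuda", "config": {"tile": 4}, "latency_ms": 0.7},
        {"workload": "conv2d_a", "target": "cuda", "config": {"tile": 16}, "latency_ms": 1.2},
    ]
)


def best_by_scan(log_text, workload, target):
    """Parse every record in the file to find the best config for one workload."""
    best = None
    for line in log_text.splitlines():
        rec = json.loads(line)
        if rec["workload"] != workload or rec["target"] != target:
            continue
        if best is None or rec["latency_ms"] < best["latency_ms"]:
            best = rec
    return best


def build_index(log_text):
    """Keep only the best record per (workload, target) key, as a keyed store would."""
    index = {}
    for line in log_text.splitlines():
        rec = json.loads(line)
        key = (rec["workload"], rec["target"])
        if key not in index or rec["latency_ms"] < index[key]["latency_ms"]:
            index[key] = rec
    return index


index = build_index(LOG_TEXT)
scanned = best_by_scan(LOG_TEXT, "conv2d_a", "cuda")
looked_up = index[("conv2d_a", "cuda")]
```

Both paths return the same record. The difference is that the scan parses every line on every query, so its cost grows with the size of the file, while the keyed lookup touches a single entry.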

Command-Line Interface and Example Usage

The system has a complete CLI with hierarchical commands. Arguments can also be specified in a config file in YAML format and expanded on the command line by prefixing the file name with "@". See the following examples for CLI usage, and configs/samples for example configurations. Note that the complete description of each command can be retrieved by the help command:

python3 -m lorien <commands> -h

  • Extract workloads from a Gluon CV model.

python3 -m lorien generate extract gcv --model alexnet --target llvm

  • Extract workloads from a TF model.

python3 -m lorien generate extract tf --model ./mobilenet.pb --target llvm

  • Extract workloads from a Gluon CV model and mutate them to generate new workloads.

python3 -m lorien generate mutate modelzoo rules.yaml --model alexnet --target llvm

  • Tune workloads with RPC servers.

# tune.yaml
rpc:
    llvm -mcpu=skylake-avx512:
      - localhost:18871
db:
    endpoint_url: http://localhost:10020
log-s3-bucket: saved-tuning-logs
ntrial: 3000

python3 -m lorien tune @tune.yaml @gcv_workloads_llvm.yaml
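
The "@" expansion used in the tuning example can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not Lorien's implementation: the function names config_to_args and expand_at_args are hypothetical, the mapping from config keys to flags is assumed, and an in-memory dictionary stands in for reading and parsing the YAML file so the sketch runs without a YAML library. The nested rpc entry is left out to keep it short.

```python
def config_to_args(config):
    """Turn a mapping loaded from a config file into a flat list of CLI tokens."""
    args = []
    for key, value in config.items():
        flag = "--" + key
        if isinstance(value, dict):
            # Assumption: each nested entry is passed as one "key: value"
            # string, mirroring the --db "endpoint_url: ..." form.
            for sub_key, sub_value in value.items():
                args += [flag, "{}: {}".format(sub_key, sub_value)]
        elif isinstance(value, list):
            args.append(flag)
            args += [str(v) for v in value]
        else:
            args += [flag, str(value)]
    return args


def expand_at_args(argv, load_config):
    """Replace every "@name" token with the arguments described by that config."""
    expanded = []
    for token in argv:
        if token.startswith("@"):
            expanded += config_to_args(load_config(token[1:]))
        else:
            expanded.append(token)
    return expanded


# Stand-in for the parsed contents of tune.yaml (hypothetical loader).
FAKE_FILES = {
    "tune.yaml": {
        "db": {"endpoint_url": "http://localhost:10020"},
        "log-s3-bucket": "saved-tuning-logs",
        "ntrial": 3000,
    }
}

argv = expand_at_args(["tune", "@tune.yaml"], FAKE_FILES.__getitem__)
```

Passing the loader as a parameter keeps the expansion logic independent of the file format, so a real YAML loader could be substituted for the dictionary lookup.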

System Requirements

  • Python 3.6+

  • Amazon DynamoDB (local or AWS): DynamoDB is used for storing and maintaining the tuned schedules. You can either 1) launch a local version and specify the endpoint URL (e.g., --db "endpoint_url: http://<your IP>:8000"), or 2) launch an AWS service, configure the AWS CLI on your machine, and specify the region name (e.g., --db "region_name: us-west-1") when invoking the tuning.

  • AWS S3 (optional): S3 is used to store the full tuning logs (JSON files generated by AutoTVM). This requirement is optional: if you do not specify --log-s3-bucket bucket_name, the full tuning logs are not uploaded and only the best schedule config is submitted to DynamoDB.
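
The two --db forms above differ only in which connection setting they carry. As a sketch of how such an option string might be turned into keyword arguments for a database client, the helper below splits it at the first colon. The name parse_db_option is hypothetical, and whether Lorien parses the option this way is an assumption.

```python
def parse_db_option(option):
    """Split a 'key: value' option string into a one-entry keyword dict."""
    # Split at the first colon only, so URLs in the value stay intact.
    key, sep, value = option.partition(":")
    if not sep or not key.strip() or not value.strip():
        raise ValueError("expected 'key: value', got {!r}".format(option))
    return {key.strip(): value.strip()}


# Local DynamoDB: point the client at an explicit endpoint.
local_kwargs = parse_db_option("endpoint_url: http://localhost:8000")

# AWS-hosted DynamoDB: select a region and rely on the configured AWS CLI credentials.
aws_kwargs = parse_db_option("region_name: us-west-1")
```

With boto3 installed, either dictionary could be passed as boto3.resource("dynamodb", **kwargs), which accepts both endpoint_url and region_name.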

Documentation

TBA
