
TurnkeyML Tools and Models


Welcome to ONNX TurnkeyML


We are on a mission to understand and use as many models as possible while leveraging the right toolchain and AI hardware for the job in every scenario.

Evaluating a deep learning model with a familiar toolchain and hardware accelerator is pretty straightforward. Scaling these evaluations to get apples-to-apples insights across a landscape of millions of permutations of models, toolchains, and hardware targets is not straightforward. Not without help, anyways.

TurnkeyML is a tools framework that integrates models, toolchains, and hardware backends to make evaluation and actuation of this landscape as simple as turning a key.

Get started

For most users, it's as simple as:

pip install turnkeyml
turnkey my_model.py

The installation guide, tutorials, and user guide have everything you need to know.

Use Cases

TurnkeyML is designed to support the following use cases. Of course, it is also quite flexible, so we are sure you will come up with some use cases of your own too.

  • ONNX Model Zoo: Export thousands of ONNX files across different ONNX opsets. This is how we generated the contents of the new ONNX Model Zoo.
        turnkey */*.py -b --onnx-opset 16
        turnkey */*.py -b --onnx-opset 17
  • Performance validation: Measure latency and throughput in hardware across devices and runtimes to understand product-market fit.
        turnkey model.py --runtime ort
        turnkey model.py --runtime torch-eager
        turnkey cache report
  • Functional coverage: Measure the functional coverage of toolchain/hardware combinations over a large corpus of models (e.g., how many models are supported by a novel compiler?).
        turnkey transformers/*.py --sequence MY_COMPILER
        turnkey cache report
  • Stress testing: Run millions of inferences across thousands of models and log all the results to find the bugs in a HW/SW stack.
        turnkey timm/*.py --iterations 1000 --device MY_DEVICE --runtime MY_RUNTIME
  • Model insights: Analyze a model to learn its parameter count, input shapes, which ONNX ops it uses, etc.
        turnkey model.py
        turnkey cache stats MY_BUILD

Demo

Let's say you have a Python script that includes a PyTorch model. Maybe you downloaded the model from Hugging Face, grabbed it from our corpus, or wrote it yourself. It doesn't matter: just call turnkey and get to work.

The turnkey CLI will analyze your script, find the model(s), run an ONNX toolchain on the model, and execute the resulting ONNX file on CPU hardware:

> turnkey bert.py
Models discovered during profiling:

bert.py:
        model (executed 1x)
                Model Type:     Pytorch (torch.nn.Module)
                Class:          BertModel (<class 'transformers.models.bert.modeling_bert.BertModel'>)
                Location:       /home/jfowers/turnkeyml/models/transformers/bert.py, line 23
                Parameters:     109,482,240 (417.64 MB)
                Input Shape:    'attention_mask': (1, 128), 'input_ids': (1, 128)
                Hash:           bf722986
                Build dir:      /home/jfowers/.cache/turnkey/bert_bf722986
                Status:         Successfully benchmarked on AMD Ryzen 9 7940HS w/ Radeon 780M Graphics (ort v1.15.1) 
                                Mean Latency:   44.168  milliseconds (ms)
                                Throughput:     22.6    inferences per second (IPS)
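
As a cross-check, the reported model size and throughput follow directly from the parameter count and mean latency. A quick sketch of the arithmetic (it assumes 4-byte fp32 parameters, and that the "MB" figure is actually mebibytes):

```python
# Sanity-check the numbers in the turnkey report above (pure arithmetic).

params = 109_482_240          # parameter count reported for BertModel
bytes_per_param = 4           # assuming fp32 weights: 4 bytes each
size_mib = params * bytes_per_param / 2**20
print(f"{size_mib:.2f} MB")   # matches the reported 417.64 MB

mean_latency_ms = 44.168
throughput_ips = 1000 / mean_latency_ms
print(f"{throughput_ips:.1f} IPS")  # matches the reported 22.6 IPS
```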

Let's say you want an fp16 ONNX file of the same model: incorporate the ONNX ML Tools fp16 converter tool into the build sequence, and the Build dir will contain the ONNX file you seek:

> turnkey bert.py --sequence optimize-fp16 --build-only
bert.py:
        model (executed 1x)
                ...
                Build dir:      /home/jfowers/.cache/turnkey/bert_bf722986
                Status:         Model successfully built!
> ls /home/jfowers/.cache/turnkey/bert_bf722986/onnx

bert_bf722986-op14-base.onnx  bert_bf722986-op14-opt-f16.onnx  bert_bf722986-op14-opt.onnx
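
The artifact names in the build dir appear to follow the pattern <build_name>-op<opset>-<stage>.onnx (inferred from the listing above; the helper below is hypothetical, not part of turnkey):

```python
def onnx_artifact_name(build_name: str, opset: int, stage: str) -> str:
    """Compose an ONNX filename following the pattern seen in the build dir."""
    return f"{build_name}-op{opset}-{stage}.onnx"

# Reproduce the three files listed above: the raw export, the optimized
# model, and the optimized model converted to fp16.
for stage in ("base", "opt", "opt-f16"):
    print(onnx_artifact_name("bert_bf722986", 14, stage))
```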

Now you want to see the fp16 model running on your Nvidia GPU with the Nvidia TensorRT runtime:

> turnkey bert.py --sequence export optimize-fp16 --device nvidia --runtime tensorrt
bert.py:
        model (executed 1x)
                ...
                Status:         Successfully benchmarked on NVIDIA GeForce RTX 4070 Laptop GPU (trt v23.09-py3) 
                                Mean Latency:   2.573   milliseconds (ms)
                                Throughput:     377.8   inferences per second (IPS)
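
Comparing the two bert.py runs, the fp16 TensorRT build cuts latency by roughly 17x over the fp32 CPU baseline (simple arithmetic on the figures above; the two configurations differ in both precision and hardware, so this is an end-to-end comparison, not an apples-to-apples one):

```python
cpu_latency_ms = 44.168   # Ryzen CPU result from the first run (fp32, ort)
gpu_latency_ms = 2.573    # RTX 4070 + TensorRT result (fp16)
speedup = cpu_latency_ms / gpu_latency_ms
print(f"{speedup:.1f}x")  # roughly 17.2x
```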

Mad with power, you want to see dozens of fp16 Transformers running on your Nvidia GPU:

> turnkey REPO_ROOT/models/transformers/*.py --sequence optimize-fp16 --device nvidia --runtime tensorrt
Models discovered during profiling:

albert.py:
        model (executed 1x)
                Class:          AlbertModel (<class 'transformers.models.albert.modeling_albert.AlbertModel'>)
                Parameters:     11,683,584 (44.57 MB)
                Status:         Successfully benchmarked on NVIDIA GeForce RTX 4070 Laptop GPU (trt v23.09-py3) 
                                Mean Latency:   1.143   milliseconds (ms)
                                Throughput:     828.3   inferences per second (IPS)

bart.py:
        model (executed 1x)
                Class:          BartModel (<class 'transformers.models.bart.modeling_bart.BartModel'>)
                Parameters:     139,420,416 (531.85 MB)
                Status:         Successfully benchmarked on NVIDIA GeForce RTX 4070 Laptop GPU (trt v23.09-py3) 
                                Mean Latency:   2.343   milliseconds (ms)
                                Throughput:     414.5   inferences per second (IPS)

bert.py:
        model (executed 1x)
                Class:          BertModel (<class 'transformers.models.bert.modeling_bert.BertModel'>)
                Parameters:     109,482,240 (417.64 MB)
                Status:         Successfully benchmarked on NVIDIA GeForce RTX 4070 Laptop GPU (trt v23.09-py3) 
                                Mean Latency:   2.565   milliseconds (ms)
                                Throughput:     378.0   inferences per second (IPS)

...

Finally, you want to visualize the results in one place so that your boss can see how productive you've been. This command will collect all of the statistics across all prior commands into a single spreadsheet.

> turnkey cache report

Summary spreadsheet saved at /home/jfowers/2023-11-30.csv
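
Because the report is a plain CSV, you can post-process it with standard tooling. A minimal sketch using only the standard library; the column names here are hypothetical, and the real header depends on your turnkey version:

```python
import csv
import io

# Hypothetical excerpt of a turnkey cache report; real column names may differ.
report_csv = """\
model_name,mean_latency_ms,throughput_ips
albert,1.143,828.3
bart,2.343,414.5
bert,2.565,378.0
"""

# Find the lowest-latency model in the report.
rows = list(csv.DictReader(io.StringIO(report_csv)))
fastest = min(rows, key=lambda r: float(r["mean_latency_ms"]))
print(fastest["model_name"])  # albert
```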

You're probably starting to get the idea :rocket:

There are many more features you can learn about in the tutorials and user guide.

What's Inside

The TurnkeyML framework has 5 core components:

  • Analysis tool: Inspect Python scripts to find the PyTorch models within. Discover insights and pass the models to the other tools.
  • Build tool: Prepare your model using industry-standard AI tools (e.g., exporters, optimizers, quantizers, and compilers). Any model-to-model transformation is fair game.
  • Runtime tool: Invoke AI runtimes (e.g., ONNX Runtime, TensorRT, etc.) to execute models in hardware and measure key performance indicators.
  • Reporting tool: Visualize statistics about the models, builds, and invocations.
  • Models corpus: Hundreds of popular PyTorch models that are ready for use with turnkey.

All of this is seamlessly integrated, so a command like turnkey repo/models/corpus/script.py gets you all of the functionality in one shot. Or you can access the functionality piecemeal with commands and APIs such as turnkey analyze script.py or build_model(my_model_instance). The tutorials show off the individual features.

You can read more about the code organization here.

Extensibility

Models

transformers graph_convolutions torch_hub torchvision timm

This repository is home to a diverse corpus of hundreds of models. We are actively working on increasing the number of models in our model library. You can see the set of models in each category by clicking on the corresponding badge.

Evaluating a new model is as simple as taking a Python script that instantiates and invokes a PyTorch torch.nn.Module and calling turnkey on it. Read about model contributions here.
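
A minimal script of the shape turnkey expects might look like the following. This is a sketch with a made-up toy model (TinyMLP is not in the corpus); the essential ingredients are a torch.nn.Module instantiation and at least one invocation, which gives the analysis tool the input shapes:

```python
# my_model.py: a minimal script for evaluation with `turnkey my_model.py`.
import torch

class TinyMLP(torch.nn.Module):
    """A toy two-layer MLP, standing in for a real model."""

    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(8, 16),
            torch.nn.ReLU(),
            torch.nn.Linear(16, 2),
        )

    def forward(self, x):
        return self.net(x)

model = TinyMLP()                    # instantiation: turnkey discovers this
outputs = model(torch.rand(1, 8))    # invocation: provides the input shapes
print(outputs.shape)
```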

Plugins

The build tool has built-in support for a variety of export and optimization tools (e.g., the PyTorch-to-ONNX exporter, ONNX ML Tools fp16 converter, etc.). Likewise, the runtime tool comes out of the box with support for x86 and Nvidia devices, along with ONNX Runtime, TensorRT, torch-eager, and torch-compiled runtimes.

If you need more, the TurnkeyML plugin API lets you extend the build and runtime tools with any functionality you like:

> pip install -e my_custom_plugin
> turnkey my_model.py --sequence my-custom-sequence --device my-custom-device --runtime my-custom-runtime --rt-args my-custom-args

All of the built-in sequences, runtimes, and devices are implemented against the plugin API. Check out the example plugins and the plugin API guide.

Contributing

We are actively seeking collaborators from across the industry. If you would like to contribute to this project, please check out our contribution guide.

Maintainers

This project is sponsored by the ONNX Model Zoo special interest group (SIG). It is maintained by @danielholanda @jeremyfowers @ramkrishna @vgodsoe in equal measure. You can reach us by filing an issue.

License

This project is licensed under the Apache 2.0 License.

