Optimise TensorFlow Lite models for Ethos-U55 NPU.
In order to be accelerated by the Ethos-U NPU the network operators must be quantised to either 8-bit (unsigned or signed) or 16-bit (signed).
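To make the quantisation requirement concrete, the sketch below shows the affine mapping that TensorFlow Lite uses for signed 8-bit tensors, real_value = scale * (quantised_value - zero_point), in plain Python. The function names, the example weights and the chosen scale are illustrative assumptions and are not part of Vela; in practice the TensorFlow Lite converter picks the scale and zero point for each tensor.

```python
def quantise_int8(values, scale, zero_point):
    """Map real values to signed 8-bit integers with an affine scheme."""
    quantised = []
    for value in values:
        q = round(value / scale) + zero_point
        quantised.append(max(-128, min(127, q)))  # clamp to the int8 range
    return quantised


def dequantise(quantised, scale, zero_point):
    """Recover approximate real values from quantised integers."""
    return [(q - zero_point) * scale for q in quantised]


if __name__ == "__main__":
    # Hypothetical weights and scale, chosen so the arithmetic is exact.
    weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
    scale, zero_point = 1 / 128, 0
    q = quantise_int8(weights, scale, zero_point)
    print(q)                                 # [-128, -64, 0, 32, 127]
    print(dequantise(q, scale, zero_point))  # [-1.0, -0.5, 0.0, 0.25, 0.9921875]
```

Note that 1.0 saturates at 127 and comes back as 0.9921875, which is the kind of rounding error a quantised network has to tolerate.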
The optimised model will contain TensorFlow Lite Custom operators for those parts of the model that can be accelerated by the Ethos-U NPU. Parts of the model that cannot be accelerated are left unchanged and will instead run on the Cortex-M series CPU using an appropriate kernel (such as the Arm optimised CMSIS-NN kernels).
After compilation the optimised model can only be run on an Ethos-U NPU embedded system.
The tool will also generate performance estimates (EXPERIMENTAL) for the compiled model.
Vela supports TensorFlow 2.1.0 (for experimental Int16 support please use the latest nightly build of TensorFlow).
Vela runs on the Linux operating system.
The following should be installed prior to the installation of Vela:
- Python >= 3.6
- GNU toolchain (GCC, Binutils and libraries) or alternative C compiler/linker toolchain
- Pipenv virtual environment tool
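The prerequisites above can be checked from a script before installing. The following is a minimal sketch using only the Python standard library; the helper name `missing_prerequisites` is hypothetical, and treating any of `gcc`, `cc` or `clang` on the `PATH` as a usable C compiler is an assumption rather than something Vela specifies.

```python
import shutil
import sys


def missing_prerequisites(version=None, which=shutil.which):
    """Return the names of prerequisites that could not be found."""
    version = sys.version_info if version is None else version
    missing = []
    if tuple(version[:2]) < (3, 6):
        missing.append("Python >= 3.6")
    # Assumption: any one of these executables counts as a C compiler.
    if not any(which(name) for name in ("gcc", "cc", "clang")):
        missing.append("C compiler")
    if which("pipenv") is None:
        missing.append("pipenv")
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        print("Missing: " + ", ".join(problems))
    else:
        print("All prerequisites found")
```

The `version` and `which` parameters exist only so the check can be exercised without touching the real environment.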
Install Vela from PyPI using the following command:
pip3 install ethos-u-vela
First obtain the source code by either downloading the desired TGZ file from:
Or by cloning the git repository:
git clone https://review.mlplatform.org/ml/ethos-u/ethos-u-vela.git
Once you have the source code, Vela can be installed using the following commands:
pip3 install -U "setuptools>=40.1.0"
pip3 install .
Or, if you use pipenv:
pipenv install .
Advanced Installation for Developers
If you plan to modify the Vela codebase then it is recommended to install Vela as an editable package to avoid the need to re-install after every modification. This is done by adding the -e option to the above install commands like so:
pip3 install -e .
Or, if you use pipenv:
pipenv install -e .
If you plan to contribute to the Vela project (highly encouraged!) then it is recommended to install Vela along with the pre-commit tools (see Vela Testing for more details).
Vela is run with an input .tflite file passed on the command line. This file contains the neural network to be compiled. The tool then outputs an optimised version with a _vela.tflite file suffix, along with the performance estimate (EXPERIMENTAL) CSV files, all to the output directory.
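A build script often needs to know where the optimised model will land. The sketch below derives that path from the `_vela.tflite` naming described above; the helper name is hypothetical, and it assumes the output name is simply the input file's stem followed by `_vela.tflite` inside the chosen output directory.

```python
from pathlib import Path


def expected_vela_output(input_model, output_dir):
    """Guess the path of the optimised model written by Vela."""
    model = Path(input_model)
    # Assumption: <output_dir>/<input stem>_vela.tflite
    return Path(output_dir) / f"{model.stem}_vela.tflite"


if __name__ == "__main__":
    print(expected_vela_output("/path/to/my_model.tflite", "results_dir").as_posix())
    # results_dir/my_model_vela.tflite
```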
If you use the pipenv virtual environment tool then first start by spawning a shell in the virtual environment:
pipenv shell
After which running Vela is the same regardless of whether you are in a virtual environment or not.
- Compile the network my_model.tflite. The optimised version will be output to
- Compile the network /path/to/my_model.tflite and specify the output to go in the directory ./results_dir:
vela --output-dir ./results_dir /path/to/my_model.tflite
- To specify information about the embedded system's configuration use Vela's system configuration file. The following command selects the MySysConfig settings that are described in the sys_cfg_vela.ini system configuration file. More details can be found in the next section.
vela --config sys_cfg_vela.ini --system-config MySysConfig my_model.tflite
- To get a list of all available options:
vela --help
Information about all of Vela's CLI options as well as the system configuration file format can be found in Vela Options.
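When Vela is driven from a build or test script rather than typed by hand, the command lines shown above can be assembled programmatically. The following is a minimal sketch, assuming `vela` is on the `PATH`; the function names are hypothetical, and only the `--output-dir`, `--config` and `--system-config` options from the examples above are used.

```python
import subprocess


def build_vela_command(model, output_dir=None, config=None, system_config=None):
    """Build the argument list for a Vela invocation."""
    command = ["vela"]
    if output_dir is not None:
        command += ["--output-dir", str(output_dir)]
    if config is not None:
        command += ["--config", str(config)]
    if system_config is not None:
        command += ["--system-config", system_config]
    command.append(str(model))  # the input .tflite file goes last
    return command


def run_vela(model, **options):
    """Run Vela; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_vela_command(model, **options), check=True)


if __name__ == "__main__":
    print(build_vela_command("/path/to/my_model.tflite", output_dir="./results_dir"))
```

Passing the arguments as a list, rather than one shell string, avoids quoting problems when model paths contain spaces.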
Some example networks that contain quantised operators which can be compiled by Vela to run on the Ethos-U NPU can be found at: https://tfhub.dev/s?deployment-format=lite&q=quantized
Please see Vela Testing.
Please see Vela Contributions.
Please see Vela Security.
Please see Vela Releases.
Additional useful information:
Vela is licensed under Apache License 2.0.