
Highly optimized inference engine for binarized neural networks.

Project description

Larq Compute Engine

Larq Compute Engine (LCE) is a highly optimized inference engine for deploying extremely quantized neural networks, such as Binarized Neural Networks (BNNs). It currently supports various mobile platforms and has been benchmarked on a Pixel 1 phone and a Raspberry Pi. LCE provides a collection of hand-optimized TensorFlow Lite custom Ops for supported instruction sets, developed in inline assembly or in C++ using compiler intrinsics. LCE leverages optimization techniques such as tiling to maximize the number of cache hits, vectorization to maximize computational throughput, and multi-threading to take advantage of modern multi-core desktop and mobile CPUs.
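The central trick that makes binarized networks so fast can be illustrated in a few lines: when weights and activations are constrained to {-1, +1}, a dot product reduces to a bitwise XNOR followed by a population count. The pure-Python sketch below is only an illustration of the idea, not LCE's actual kernels, which operate on packed 64-bit words in assembly or compiler intrinsics:

```python
# XNOR-popcount sketch: dot product of two {-1, +1} vectors packed as
# integer bitmasks (bit set = +1). For n-element vectors the identity is
# a . b = popcount(XNOR(a, b)) - (n - popcount(XNOR(a, b))).

def pack(vec):
    """Pack a list of {-1, +1} values into an int bitmask (bit set = +1)."""
    bits = 0
    for i, v in enumerate(vec):
        if v == 1:
            bits |= 1 << i
    return bits

def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two packed n-element {-1, +1} vectors."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask   # 1 wherever the signs agree
    agree = bin(xnor).count("1")       # popcount
    return agree - (n - agree)         # +1 per agreement, -1 per disagreement

a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, 1]
assert binary_dot(pack(a), pack(b), len(a)) == sum(x * y for x, y in zip(a, b))
```

Because one machine word holds 64 such elements, a single XNOR plus popcount instruction replaces 64 multiply-accumulates, which is where the bulk of the speed-up comes from.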

Key Features

  • Effortless end-to-end integration from training to deployment:

    • Tight integration of LCE with Larq and TensorFlow provides a smooth end-to-end training and deployment experience.

    • A collection of Larq pre-trained BNN models for common machine learning tasks is available in Larq Zoo and can be used out-of-the-box with LCE.

    • LCE provides a custom MLIR-based model converter which is fully compatible with TensorFlow Lite and performs additional network level optimizations for Larq models.

  • Lightning fast deployment on a variety of mobile platforms:

    • LCE enables high performance, on-device machine learning inference by providing hand-optimized kernels and network level optimizations for BNN models.

    • LCE currently supports ARM64-based mobile platforms such as Android phones and Raspberry Pi boards.

    • Thread parallelism support in LCE is essential for modern mobile devices with multi-core CPUs.

Performance

The table below presents single-threaded performance of Larq Compute Engine on several generations of Larq BNN models, measured on a Pixel 1 phone (2016) and a Raspberry Pi 4 (BCM2711) board:

Model    Accuracy    Pixel 1, ms    RPi 4 (BCM2711), ms
TODO     TODO        TODO           TODO
TODO     TODO        TODO           TODO
TODO     TODO        TODO           TODO
TODO     TODO        TODO           TODO

The following table presents multi-threaded performance of Larq Compute Engine on a Pixel 1 phone and a Raspberry Pi 4 board:

Model    Accuracy    Pixel 1, ms    RPi 4 (BCM2711), ms
TODO     TODO        TODO           TODO
TODO     TODO        TODO           TODO
TODO     TODO        TODO           TODO
TODO     TODO        TODO           TODO

Benchmarked in February TODO with the LCE custom TFLite Model Benchmark Tool (see here), using BNN models with randomized weights and inputs.

Getting started

Follow these steps to deploy a BNN with LCE:

  1. Pick a Larq model

    You can use Larq to build and train your own model or pick a pre-trained model from Larq Zoo.

  2. Convert the Larq model

LCE is built on top of TensorFlow Lite and uses the TensorFlow Lite FlatBuffer format to convert and serialize Larq models for inference. We provide an LCE Converter with additional optimization passes that increase the execution speed of Larq models on the supported target platforms.

  3. Build LCE

    The LCE documentation provides the build instructions for Android and ARM64-based boards such as Raspberry Pi. Please follow the provided instructions to create a native LCE build or cross-compile for one of the supported targets.

  4. Run inference

LCE uses the TensorFlow Lite Interpreter to perform inference. In addition to the built-in TensorFlow Lite Ops, optimized LCE Ops are registered with the interpreter to execute the Larq-specific subgraphs of the model. An example of creating and building an LCE-compatible TensorFlow Lite interpreter in your own application is provided here.
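The first two steps above can be sketched in Python. Note that the specific names used here (`larq_zoo.BinaryResNetE18`, `larq_compute_engine.convert_keras_model`) are assumptions based on the Larq documentation and later releases; this release candidate's API may differ, so treat this as an outline rather than the definitive API:

```python
# Hedged sketch of steps 1-2: pick a pre-trained BNN from Larq Zoo and
# convert it to a TFLite FlatBuffer with the LCE converter. The import
# paths and function names below are assumptions, not verified against
# this exact release.

def convert_pretrained_bnn(output_path: str = "binary_resnet_e18.tflite"):
    import larq_zoo as lqz             # pre-trained BNN models (step 1)
    import larq_compute_engine as lce  # MLIR-based model converter (step 2)

    model = lqz.BinaryResNetE18(weights="imagenet")
    flatbuffer_bytes = lce.convert_keras_model(model)  # TFLite FlatBuffer
    with open(output_path, "wb") as f:
        f.write(flatbuffer_bytes)
    return output_path

# Steps 3-4 happen on the target device: build LCE for your platform, then
# load the .tflite file with a TensorFlow Lite interpreter that has the LCE
# custom ops registered (see the C++ example in the LCE docs).
```

The resulting `.tflite` file cannot be run with a stock TensorFlow Lite interpreter, because the binary convolutions are LCE custom Ops; it must be executed by an interpreter built with the LCE Op registrations as described in steps 3 and 4.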

Next steps

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

larq_compute_engine-0.1.0rc1-cp37-cp37m-manylinux2010_x86_64.whl (5.2 MB)

Uploaded: CPython 3.7m, manylinux: glibc 2.12+ x86-64

larq_compute_engine-0.1.0rc1-cp37-cp37m-macosx_10_15_x86_64.whl (8.6 MB)

Uploaded: CPython 3.7m, macOS 10.15+ x86-64

larq_compute_engine-0.1.0rc1-cp36-cp36m-manylinux2010_x86_64.whl (5.2 MB)

Uploaded: CPython 3.6m, manylinux: glibc 2.12+ x86-64

larq_compute_engine-0.1.0rc1-cp36-cp36m-macosx_10_15_x86_64.whl (8.6 MB)

Uploaded: CPython 3.6m, macOS 10.15+ x86-64

File details

Details for the file larq_compute_engine-0.1.0rc1-cp37-cp37m-manylinux2010_x86_64.whl.

File metadata

  • Download URL: larq_compute_engine-0.1.0rc1-cp37-cp37m-manylinux2010_x86_64.whl
  • Upload date:
  • Size: 5.2 MB
  • Tags: CPython 3.7m, manylinux: glibc 2.12+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.7.6

File hashes

Hashes for larq_compute_engine-0.1.0rc1-cp37-cp37m-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 046462dfa451f62a74dcc4956e92700f658ae036cd0d7cea736e20f8fbda82c0
MD5 6b3352651dcc099ffe64e6d5af65aad9
BLAKE2b-256 460a36880089a7335708fcf1d32ea73367a81127139cabc52d3035dc4ff83c9d

See more details on using hashes here.
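The published digests above can be checked locally before installing a manually downloaded wheel. A minimal sketch using only the standard library (the filename is the first wheel in this listing; `sha256_of` is a helper defined here, not part of any package):

```python
# Verify a downloaded wheel against a published SHA256 digest by hashing
# the file in chunks (so arbitrarily large files fit in constant memory).
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

expected = "046462dfa451f62a74dcc4956e92700f658ae036cd0d7cea736e20f8fbda82c0"
# Uncomment after downloading the wheel next to this script:
# assert sha256_of(
#     "larq_compute_engine-0.1.0rc1-cp37-cp37m-manylinux2010_x86_64.whl"
# ) == expected
```

In practice `pip install larq-compute-engine` performs this hash check automatically against the PyPI metadata; manual verification is mainly useful for wheels fetched out of band.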

File details

Details for the file larq_compute_engine-0.1.0rc1-cp37-cp37m-macosx_10_15_x86_64.whl.

File metadata

  • Download URL: larq_compute_engine-0.1.0rc1-cp37-cp37m-macosx_10_15_x86_64.whl
  • Upload date:
  • Size: 8.6 MB
  • Tags: CPython 3.7m, macOS 10.15+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.7.6

File hashes

Hashes for larq_compute_engine-0.1.0rc1-cp37-cp37m-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 a2be91154ee76c7027960f4fb26bb0ad2d5262139bf000723a385ef2177ce406
MD5 23c5b7f61a0873a72776ca3781907173
BLAKE2b-256 2fc39d657f2a0a233903e29e38df9d8599249bfbcae076dcabdafdfc225beb26

See more details on using hashes here.

File details

Details for the file larq_compute_engine-0.1.0rc1-cp36-cp36m-manylinux2010_x86_64.whl.

File metadata

  • Download URL: larq_compute_engine-0.1.0rc1-cp36-cp36m-manylinux2010_x86_64.whl
  • Upload date:
  • Size: 5.2 MB
  • Tags: CPython 3.6m, manylinux: glibc 2.12+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.7.6

File hashes

Hashes for larq_compute_engine-0.1.0rc1-cp36-cp36m-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 77d49f189e3797e22533afdec3fa538730b6786831b5d86cd457f5fe27c9052d
MD5 bded4ed2481acfdca37f5e641eb2211e
BLAKE2b-256 194093dbac2f2b26d149fe43af46593532c4878650c84c0a9c79b0f1ff8de470

See more details on using hashes here.

File details

Details for the file larq_compute_engine-0.1.0rc1-cp36-cp36m-macosx_10_15_x86_64.whl.

File metadata

  • Download URL: larq_compute_engine-0.1.0rc1-cp36-cp36m-macosx_10_15_x86_64.whl
  • Upload date:
  • Size: 8.6 MB
  • Tags: CPython 3.6m, macOS 10.15+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.2.0 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.7.6

File hashes

Hashes for larq_compute_engine-0.1.0rc1-cp36-cp36m-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 3b8b777a9937d002b5b95f5c7c8b09c6a6b56728dd3c1392953e055aa05954e2
MD5 27588f49f0ce8797cc04656cf359efff
BLAKE2b-256 e3bf9c6457a38bc3a7fbbd7f86ef5d2421514a7c0da7f7d092131189b0bbfa8a

See more details on using hashes here.
