Tooling for ML in LLVM

Project description

Infrastructure for MLGO - a Machine Learning Guided Compiler Optimizations Framework.

MLGO is a framework for integrating ML techniques systematically in LLVM. It replaces human-crafted optimization heuristics in LLVM with machine learned models. The MLGO framework currently supports two optimizations:

  1. inlining-for-size (LLVM RFC);
  2. register-allocation-for-performance (LLVM RFC).

The compiler components are both available in the main LLVM repository. This repository contains the training infrastructure and related tools for MLGO.

We currently use two different ML algorithms to train policies: Policy Gradient and Evolution Strategies. At present, this repository supports only Policy Gradient training; the release of Evolution Strategies training is on our roadmap.

Check out this demo for an end-to-end demonstration of how to train your own inlining-for-size policy from scratch with Policy Gradient, or check out this demo for a demonstration of how to train your own regalloc-for-performance policy.

For more details about MLGO, please refer to our paper MLGO: a Machine Learning Guided Compiler Optimizations Framework.

For more details about how to contribute to the project, please refer to contributions.

Pretrained models

We occasionally release pretrained models that may be used as-is with LLVM. Models are published as GitHub releases and are named [task]-[major-version].[minor-version]. The versions are semantic: the major version corresponds to breaking changes on the LLVM/compiler side, and the minor version corresponds to model updates that are independent of the compiler.
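The naming scheme above can be handled mechanically. The following is a minimal sketch, not part of the MLGO tooling: the helper names (`ModelRelease`, `parse_release_name`, `newest_compatible`) are hypothetical, the optional `v` before the version number is an assumption about how real release tags may be written, and the release names in the usage note are made up for illustration.

```python
import re
from typing import Iterable, NamedTuple, Optional


class ModelRelease(NamedTuple):
    """Hypothetical container for a parsed release name."""
    task: str
    major: int
    minor: int


# Assumed pattern: "<task>-<major>.<minor>". The task may itself contain
# hyphens, and an optional "v" before the version is tolerated (assumption).
_RELEASE_RE = re.compile(r"^(?P<task>.+)-v?(?P<major>\d+)\.(?P<minor>\d+)$")


def parse_release_name(name: str) -> ModelRelease:
    """Split a release name into task, major version, and minor version."""
    match = _RELEASE_RE.match(name)
    if match is None:
        raise ValueError(f"not a [task]-[major].[minor] release name: {name!r}")
    return ModelRelease(match["task"], int(match["major"]), int(match["minor"]))


def newest_compatible(
    names: Iterable[str], task: str, compiler_major: int
) -> Optional[ModelRelease]:
    """Pick the highest minor version for a task at a fixed major version.

    The major version tracks breaking compiler-side changes, so only
    releases with a matching major version are considered.
    """
    candidates = [
        release
        for release in map(parse_release_name, names)
        if release.task == task and release.major == compiler_major
    ]
    return max(candidates, key=lambda release: release.minor, default=None)
```

For example, given the made-up names `example-task-1.0`, `example-task-1.3`, and `example-task-2.0`, calling `newest_compatible(names, "example-task", 1)` selects the `1.3` release, because `2.0` signals a breaking change on the compiler side.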

When building LLVM, you may set the flag -DLLVM_INLINER_MODEL_PATH to the path of your inlining model. If the flag is set to download, CMake will download the most recent compatible model from GitHub. Example values for the flag:

# Model is in /tmp/model, i.e. there is a file /tmp/model/saved_model.pb along
# with the rest of the tensorflow saved_model files produced from training.
-DLLVM_INLINER_MODEL_PATH=/tmp/model

# Download the most recent compatible model
-DLLVM_INLINER_MODEL_PATH=download
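Before configuring LLVM, it can help to check that a local model directory actually has the layout described above. This is a minimal sketch under stated assumptions: the helper `inliner_model_flag` is hypothetical and not part of LLVM or this repository, and it checks only for `saved_model.pb`, not for the other saved_model files.

```python
from pathlib import Path

# Flag name taken from the text above.
FLAG = "-DLLVM_INLINER_MODEL_PATH"


def inliner_model_flag(model_path: str) -> str:
    """Return the CMake argument for a model path (hypothetical helper).

    The special value "download" is passed through unchanged; any other
    value must be a directory that contains saved_model.pb.
    """
    if model_path == "download":
        return f"{FLAG}=download"
    model_dir = Path(model_path)
    if not (model_dir / "saved_model.pb").is_file():
        raise FileNotFoundError(f"{model_dir} does not contain saved_model.pb")
    return f"{FLAG}={model_dir}"
```

The returned string can be appended to the argument list of a cmake invocation; failing early here avoids discovering a misplaced model only at LLVM configure time.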

Prerequisites

Currently, the assumptions for the system are:

  • A recent Ubuntu distro, e.g. 20.04
  • Python 3.8.x/3.9.x/3.10.x
  • For local training, which is currently the only supported mode, we recommend a high-performance workstation (e.g. 96 hardware threads).
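The Python version assumption can be verified up front. The sketch below is illustrative only; `SUPPORTED_MINORS` and `is_supported_python` are hypothetical names, and the set simply mirrors the versions listed above.

```python
import sys

# Mirrors the versions listed in the prerequisites above.
SUPPORTED_MINORS = {(3, 8), (3, 9), (3, 10)}


def is_supported_python(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) pair is one of the listed versions."""
    return (version_info[0], version_info[1]) in SUPPORTED_MINORS


if __name__ == "__main__":
    if not is_supported_python():
        print("Warning: this Python version is outside the listed 3.8 to 3.10 range.")
```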

Training assumes a clang build with ML 'development-mode'. Please refer to:

The prerequisites specific to model training are:

Pipenv:

pip3 install pipenv

The actual dependencies:

pipenv sync --system

Note that the above command will only work from the root of the repository since it needs to have Pipfile.lock in the working directory at the time of execution.
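A small guard can catch the wrong-directory case before pipenv is invoked. This is a sketch only; `has_pipfile_lock` is a hypothetical helper and not something shipped with the repository.

```python
from pathlib import Path
from typing import Optional


def has_pipfile_lock(directory: Optional[str] = None) -> bool:
    """Return True if Pipfile.lock exists in the given (or current) directory."""
    base = Path(directory) if directory is not None else Path.cwd()
    return (base / "Pipfile.lock").is_file()


if __name__ == "__main__":
    if has_pipfile_lock():
        print("Pipfile.lock found; 'pipenv sync --system' can be run from here.")
    else:
        print("Pipfile.lock not found; change to the repository root first.")
```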

If you plan on doing development work, make sure you grab the development and CI categories of packages as well:

pipenv sync --system --categories "dev-packages ci"

Optionally, to run tests (run_tests.sh), you also need:

sudo apt-get install virtualenv

Note that the same TensorFlow package is also needed to build LLVM in 'release' mode.

Docs

An end-to-end demo using Fuchsia as a codebase from which we extract a corpus and train a model.

How to add a feature guide. Extensibility model.
