Runtime specialization and JIT compilation built on LLVM
Proteus
Proteus is a programmable runtime specialization and Just-In-Time (JIT) layer built on LLVM. It embeds into existing C++ codebases and accelerates host, CUDA, and HIP applications by using runtime context to specialize code and enable optimizations beyond static compilation.
Description
Standard ahead-of-time (AOT) compilation can only optimize a program with the information available at build time. Proteus goes further by embedding optimizing JIT compilation directly into C/C++ applications.
Runtime context, such as the actual values of variables during execution, lets Proteus specialize code on the fly and apply compiler optimizations that improve performance beyond what static compilation allows.
Several frontends are available, depending on how you want to describe JIT code:
| Interface | Input style | Best for | Specialization model | Requires Clang AOT? |
|---|---|---|---|---|
| Code annotations | Existing C/C++/CUDA/HIP code | Incremental adoption in existing applications | Values, arrays, objects, and launch configuration | Yes |
| C++ frontend API | C++ source strings | Runtime-generated C++ and templates | Values, arrays, objects, and launch configuration | No |
| LLVM IR frontend API | LLVM IR text or bitcode | Reusing externally generated LLVM IR with Proteus caching and dispatch | Encoded in the provided LLVM IR | No |
| MLIR frontend API | MLIR source strings | Direct access to MLIR lowering | Encoded in the provided MLIR source | No |
| Embedded DSL API | Programmatic builders | Runtime code generation with high-level constructs | Values, arrays, and launch configuration | No |
These frontends can target host, CUDA, and HIP execution paths, with backend support depending on how Proteus was configured:
| Interface | Host | CUDA | HIP | Notes |
|---|---|---|---|---|
| Code annotations | Yes | Yes | Yes | Requires compiling with Clang and uses the Proteus LLVM pass |
| C++ frontend API | Yes | Yes | Yes | Uses Clang by default; CUDA paths can use NVCC |
| LLVM IR frontend API | Yes | Yes | Yes | Accepts LLVM IR text or bitcode and compiles it directly through LLVM |
| MLIR frontend API | Yes | Yes | Yes | Requires PROTEUS_ENABLE_MLIR=ON |
| Embedded DSL API | Yes | Yes | Yes | Uses the LLVM backend by default; MLIR backend requires PROTEUS_ENABLE_MLIR=ON |
CUDA, HIP, and MLIR support are available when Proteus is built with the corresponding configuration options enabled.
Proteus includes both in-memory and persistent caching, ensuring that once code has been compiled and optimized, the cost of recompilation is avoided.
Proteus consists of an LLVM pass and a runtime library that implements JIT compilation and optimization using LLVM as a library.
- The code annotation interface requires compiling your application with Clang so the Proteus LLVM pass can parse annotations.
- The DSL, C++ frontend, LLVM IR frontend, and MLIR frontend APIs don’t depend on which AOT compiler you use.
In all cases, you link your application against the Proteus runtime library. Details are provided later.
Installation
Python API users can install proteus-python from wheels. C++ users should
install Proteus from source or via spack.
Spack
We provide a packaging recipe for Spack in the subdirectory packaging/spack.
Assuming you have a Spack installation (preferably with an isolated Spack environment), clone Proteus, add its Spack repo, and install the package by running:
git clone https://github.com/Olympus-HPC/proteus.git
spack repo add proteus/packaging/spack
spack install proteus
We provide several variants to match different configurations, including CUDA, ROCm, and MPI support. A complete list of variants and their descriptions is available in the Spack package file, or viewable through:
spack info proteus
Some typical examples:
# Install the latest version with CUDA support for sm_90 arch.
spack install proteus +cuda cuda_arch=90
# Install the latest version with ROCm support for gfx942 arch.
spack install proteus +rocm amdgpu_target=gfx942
# Install the latest version with MPI support.
spack install proteus +mpi
Building from source
The project uses cmake and requires an LLVM installation.
CI tests currently cover LLVM 19, 20, and 22 with CUDA 12.2 and with AMD
ROCm 6.4.3 (based on LLVM 19), 7.1.1 (based on LLVM 20), and 7.2.0 (based
on LLVM 22).
See the top-level CMakeLists.txt for the available build options.
A typical build looks like this:
mkdir -p build && cd build
cmake -DLLVM_INSTALL_DIR=<llvm_install_path> -DCMAKE_INSTALL_PREFIX=<install_path> ..
make install
The scripts directory contains setup scripts for building on different targets
(host-only, CUDA, ROCm) used on LLNL machines.
They also serve as good starting points to adapt for other environments.
Run them from the repository root:
source scripts/setup-<target>.sh
These scripts load environment modules (specific to LLNL systems) and create a
build-<hostname>-<target>-<version> directory with a
working configuration.
Python wheels
Proteus now publishes a thin proteus-python shim package plus backend wheels.
The default install is shim-only from PyPI. Install an explicit backend from
the Olympus-HPC wheel index.
The shim package provides the Python import surface and backend discovery. The native payload lives in backend-specific wheels published outside PyPI.
| Install | Backend | Target | Required compiler/toolchain |
|---|---|---|---|
| pip install proteus-python | shim only | Python API only | none |
| pip install --index-url https://olympus-hpc.github.io/proteus/wheels/simple/ proteus-python-backend-host-llvm22 | proteus-python-backend-host-llvm22 | Host CPU | LLVM/Clang 22.x |
| pip install --index-url https://olympus-hpc.github.io/proteus/wheels/simple/ proteus-python-backend-cuda12-llvm22 | proteus-python-backend-cuda12-llvm22 | Host CPU + NVIDIA CUDA GPU | CUDA 12.x plus LLVM/Clang 22.x |
| pip install --index-url https://olympus-hpc.github.io/proteus/wheels/simple/ proteus-python-backend-rocm72 | proteus-python-backend-rocm72 | Host CPU + AMD ROCm GPU | ROCm 7.2.x |
Typical stable installs (pick the one pair that matches your target):
# Host CPU backend.
python -m pip install proteus-python
python -m pip install --index-url https://olympus-hpc.github.io/proteus/wheels/simple/ \
proteus-python-backend-host-llvm22
# Host CPU + NVIDIA CUDA backend.
python -m pip install proteus-python
python -m pip install --index-url https://olympus-hpc.github.io/proteus/wheels/simple/ \
proteus-python-backend-cuda12-llvm22
# Host CPU + AMD ROCm backend.
python -m pip install proteus-python
python -m pip install --index-url https://olympus-hpc.github.io/proteus/wheels/simple/ \
proteus-python-backend-rocm72
See docs/dev/python-wheel.md for packaging and release details.
Integrating with your build system
CMake
To integrate Proteus with CMake, add the install prefix to CMAKE_PREFIX_PATH,
or pass it explicitly with
-Dproteus_DIR=<install_path>/<libdir>/cmake/proteus during configuration
where <libdir> is typically lib or lib64.
Then, in your project's CMakeLists.txt add:
find_package(proteus CONFIG REQUIRED)
add_proteus(<target>)
If you only need the DSL, C++ frontend, LLVM IR frontend, or MLIR frontend APIs, you can link directly against
proteusFrontend.
In this case, you don’t need to compile your target with Clang:
find_package(proteus CONFIG REQUIRED)
target_link_libraries(<target> ... proteusFrontend ...)
Make
With make, annotation-based integration requires adding compilation and
linking flags, for example:
CXXFLAGS += -I<install_path>/include -fpass-plugin=<install_path>/<libdir>/libProteusPass.so
LDFLAGS += -L<install_path>/<libdir> -Wl,-rpath,<install_path>/<libdir> -lproteus $(shell llvm-config --libs) -lclang-cpp
If you don't use code annotations, you can omit the -fpass-plugin option,
since the LLVM pass is only needed for processing annotations.
Using
Proteus's core optimization technique is runtime constant folding.
It replaces runtime values with constants during JIT compilation, which in turn
turbo-charges classical compiler optimizations such as loop unrolling,
control-flow simplification, and constant propagation.
Think of it as doing constexpr, but at runtime.
Values that can be folded include function or kernel arguments, kernel launch dimensions, launch bounds, and other runtime variables.
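To make the idea concrete, the following is a minimal, hand-written illustration in plain C++ of what folding runtime values into constants buys the optimizer. It does not use the Proteus API: the function names are hypothetical, and the "specialized" version is written by hand to mimic what a JIT could produce once it observes n == 4 and scale == 2.0 at run time.

```cpp
#include <cstddef>

// Generic version: `n` and `scale` are only known at run time, so an
// ahead-of-time compiler has to keep the loop and the multiply general.
double scaled_sum(const double* x, std::size_t n, double scale) {
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        acc += scale * x[i];
    return acc;
}

// Hand-written stand-in for a specialization with n == 4 and scale == 2.0
// folded in as constants (hypothetical name, for illustration only).
// With a constant trip count the loop can be fully unrolled, and the
// constant multiplier can be propagated into each term.
double scaled_sum_n4_scale2(const double* x) {
    constexpr std::size_t n = 4;
    constexpr double scale = 2.0;
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        acc += scale * x[i];
    return acc;
}
```

Both functions compute the same result for a four-element input with a scale of 2.0; the difference is only in how much the compiler knows when it optimizes the body.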
Choose the interface that matches how your application wants to describe JIT work:
| Interface | Detailed guide |
|---|---|
| Code annotations | Code Annotations |
| C++ frontend API | C++ Frontend API |
| LLVM IR frontend API | LLVM IR Frontend API |
| MLIR frontend API | MLIR Frontend API |
| DSL API | DSL API |
Proteus generates a unique specialization for each distinct set of runtime values and caches them in memory and on disk, so JIT overhead is minimized within and across runs.
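The per-value caching described above can be sketched as an in-memory map keyed by the runtime values being specialized on. This is only a conceptual sketch under stated assumptions, not how Proteus implements its cache: the class and member names are hypothetical, a lambda stands in for JIT-compiled code, and persistent (on-disk) caching is not modeled.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <tuple>

// Hypothetical names throughout; this is not the Proteus API.
using Kernel = std::function<double(const double*)>;
using Key = std::tuple<std::size_t, double>;  // (n, scale)

class SpecializationCache {
public:
    // Return the specialization for (n, scale), building it on first use.
    const Kernel& get(std::size_t n, double scale) {
        Key key{n, scale};
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            ++builds_;  // stands in for one expensive JIT compilation
            it = cache_.emplace(key, make_specialized(n, scale)).first;
        }
        return it->second;
    }

    // Number of specializations built so far.
    std::size_t builds() const { return builds_; }

private:
    // Placeholder for code generation: captures the runtime values so the
    // returned callable behaves as if they were baked in.
    static Kernel make_specialized(std::size_t n, double scale) {
        return [n, scale](const double* x) {
            double acc = 0.0;
            for (std::size_t i = 0; i < n; ++i)
                acc += scale * x[i];
            return acc;
        };
    }

    std::map<Key, Kernel> cache_;
    std::size_t builds_ = 0;
};
```

Calling get(3, 2.0) twice builds one specialization and reuses it; a call with a different value set, such as get(2, 2.0), builds a second one. The practical consequence is that specializing on values that change on every call produces many entries, so it pays to specialize on values that repeat.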
Documentation
The Proteus documentation has more extensive information, including a user's guide and developer manual.
Contributing
We welcome contributions to Proteus in the form of pull requests targeting the
main branch of the repo, as well as questions, feature requests, or bug reports
via issues.
Code of Conduct
Please note that Proteus has a Code of Conduct. By participating in the Proteus community, you agree to abide by its rules.
Authors
Proteus was created by Giorgis Georgakoudis, georgakoudis1@llnl.gov.
Key contributors are:
- David Beckingsale, beckingsale1@llnl.gov
- Konstantinos Parasyris, parasyris1@llnl.gov
- John Bowen, bowen36@llnl.gov
- Zane Fink, fink12@llnl.gov
- Tal Ben Nun, bennun2@llnl.gov
- Thomas Stitt, stitt4@llnl.gov
License
Proteus is distributed under the terms of the Apache License (Version 2.0) with LLVM Exceptions.
All new contributions must be made under the Apache-2.0 with LLVM Exceptions license.
See LICENSE, COPYRIGHT, and NOTICE for details.
SPDX-License-Identifier: (Apache-2.0 WITH LLVM-exception)
LLNL-CODE-2000857