A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations
ProGraML: Program Graphs for Machine Learning
An expressive, language-independent representation of programs.
Check the website for more information.
Introduction
ProGraML is a representation for programs as input to a machine learning model. The key features are:
- Simple: Everything is available through a `pip install`, no compilation required. Supports several programming languages (C, C++, LLVM-IR, XLA) and several graph formats (NetworkX, DGL, Graphviz, JSON) out of the box.
- Expressive: Captures every control, data, and call relation across entire programs. The representation is independent of the source language. Features and labels can be added at any granularity to support whole-program, per-instruction, or per-relation reasoning tasks.
- Fast: The core graph construction is implemented in C++ with a low overhead interface to Python. Every API method supports simple and efficient parallelization through an `executor` parameter.
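To make the three relation kinds concrete, here is a minimal sketch that tallies control, data, and call edges in a tiny hand-written graph. It does not call ProGraML at all: the dictionary layout (a `links` list whose entries carry an integer `flow` field), the flow codes in `FLOW_NAMES`, and the helper `count_relations` are assumptions made for illustration, so check the API reference for the schema that the library actually emits.

```python
from collections import Counter

# Assumed integer codes for the three relation kinds (illustrative only).
FLOW_NAMES = {0: "control", 1: "data", 2: "call"}


def count_relations(node_link):
    """Count edges per relation kind in a node-link style dictionary."""
    counts = Counter()
    for edge in node_link["links"]:
        # Unknown codes are grouped under "other" rather than raising.
        counts[FLOW_NAMES.get(edge["flow"], "other")] += 1
    return dict(counts)


# Hand-written stand-in graph, NOT real ProGraML output.
toy_graph = {
    "nodes": [
        {"id": 0, "text": "[external]"},
        {"id": 1, "text": "ret"},
        {"id": 2, "text": "i32"},
    ],
    "links": [
        {"source": 0, "target": 1, "flow": 2},  # call edge into the function
        {"source": 2, "target": 1, "flow": 1},  # data edge: value used by ret
        {"source": 1, "target": 0, "flow": 2},  # call edge back to the caller
    ],
}

print(count_relations(toy_graph))
```

The same tally could serve as a quick sanity check on a real graph once it has been exported to a dictionary with a matching layout.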
To get stuck in and play around with our graph representation, visit:
Or if papers are more your ☕, have a read of ours:
Getting Started
Install the latest release of the Python package using:
pip install -U programl
The API is very simple, comprising graph creation ops, graph transform ops, and graph serialization ops. Here is a quick demo of each:
>>> import programl as pg
# Construct a program graph from C++:
>>> G = pg.from_cpp("""
... #include <iostream>
...
... int main(int argc, char** argv) {
...     std::cout << "Hello, world!" << std::endl;
...     return 0;
... }
... """)
# A program graph is a protocol buffer:
>>> type(G).__name__
'ProgramGraph'
# Convert the graph to NetworkX:
>>> pg.to_networkx(G)
<networkx.classes.multidigraph.MultiDiGraph at 0x7fbcf40a2fa0>
# Save the graph for later:
>>> pg.save_graphs('file.data', [G])
For further details check out the API reference.
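As a rough sketch of the `executor` parameter mentioned in the introduction, the helper below hands a list of C++ sources to `pg.from_cpp()` together with a `concurrent.futures.ThreadPoolExecutor`. The helper name `build_graphs`, its `max_workers` default, and the injectable `from_cpp` argument are hypothetical and exist only for illustration. The sketch also assumes that passing a list of sources returns an iterable of graphs, so consult the API reference before relying on it.

```python
from concurrent.futures import ThreadPoolExecutor


def build_graphs(sources, max_workers=4, from_cpp=None):
    """Build one program graph per source string using a thread pool.

    `from_cpp` is a hypothetical hook so the sketch can be exercised
    without the package installed; by default it is programl.from_cpp.
    """
    if from_cpp is None:
        import programl as pg  # requires `pip install -U programl`

        from_cpp = pg.from_cpp

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Assumption: a list of inputs yields an iterable of graphs, so the
        # results are collected before the pool is shut down.
        return list(from_cpp(sources, executor=executor))
```

With the package installed, a call such as `build_graphs(["int main() { return 0; }", "int f(int x) { return x + 1; }"])` would be expected to return one graph per source string.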
Supported Programming Languages
The following programming languages and compiler IRs are supported out-of-the-box:
Language | API Calls | Supported Versions |
---|---|---|
C | programl.from_cpp(), programl.from_clang() | Up to ISO C 2017 |
C++ | programl.from_cpp(), programl.from_clang() | Up to ISO C++ 2020 DIS |
LLVM-IR | programl.from_llvm_ir() | 3.8.0, 6.0.0, 10.0.0 |
XLA | programl.from_xla_hlo_proto() | 2.0.0 |
Is your favorite language not supported here? Submit a feature request!
Contributing
Patches, bug reports, and feature requests are welcome! Please use the issue tracker to file a bug report or ask a question. If you would like to help out with the code, please read this document.
Citation
If you use ProGraML in any of your work, please cite this paper:
@inproceedings{cummins2021a,
title={{ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations}},
author={Cummins, Chris and Fisches, Zacharias and Ben-Nun, Tal and Hoefler, Torsten and O'Boyle, Michael and Leather, Hugh},
booktitle = {Thirty-eighth International Conference on Machine Learning (ICML)},
year={2021}
}