A to-SQL Compiler Prototype.
Project description
Flummi
A to-SQL Compiler Prototype.
Flummi is a research compiler prototype for to-SQL compilation, implementing methods we've researched and developed at the database chair of the University of Tübingen---see the list of related publications below.
Usage
The Flummi compiler in this repository can be used either as a python library or directly as a command line tool. Either way, to use our compiler, you will need to add it to your local environment. You can install it from PyPI or as a git dependency:
$ pip install flummi
$ pip install flummi@git+https://github.com/DBatUTuebingen/flummi
Library mode.
As a python library, we expose only the function compile and the module IR containing our intermediate representations. The compile function is straightforward; as the name implies it takes a program and compiles it, yielding the compiled query string. The programs can either be supplied in the form of a source string (you can find a description of the syntax down below) or an AST.
from flummi import compile
query = compile("""
{
DECLARE v : §int§;
LET v = §0§[];
EMIT v;
LET v = §{0} + 1§[v];
EMIT v;
STOP
}
""")
To map errors back to original source locations during exception handling, we track the provenance of every AST/IR node in a common attribute called location of type flummi.library.errors.Location | None. When present, we use the provenance information to render more helpful error messages, but we do not require it! To make the pretty error messages work when supplying compile with a custom AST, you need to add a proper code location to each AST node and also pass your source code using the source keyword argument.
from flummi import compile
from flummi.IR import AST
from flummi.library.errors import Location
source = "EMIT v"
ast = AST.Program(
AST.Emit(
AST.Variable("v", location=Location(line=1, column=6)),
location=Location(line=1, column=1),
),
location=Location(line=1, column=1),
)
query = compile(ast, source=source)
flummi.compiler.analysis.AnalysisError: AnalysisError:
Found read from uninitialised variable 'v'.
1 | EMIT v
---------^
Tool mode.
To use the Flummi compiler as a CLI tool, you will need to install it with the extra-dependencies cli, i.e., pip install flummi[cli]. When using the compiler via the CLI, you can pass it a Flummi source file which it will in turn compile and dump the compiled SQL query to stdout or an optional file.
-- contents of input.fl
{
DECLARE v : §int§;
LET v = §0§[];
EMIT v;
LET v = §{0} + 1§[v];
EMIT v;
STOP
}
$ flummi input.fl [output.sql]
WITH
"start.1"("#️⃣", "⚙️") AS (
SELECT 0 AS "#️⃣",
NULL AS "⚙️"
),
...
For debugging purposes, or if you just a bit nosy, the compiler can also spit out a graphically renditions of the IR used during our compilation. To render these we rely on being able to find a working version of graphviz somewhere on your path—so make sure to have that installed if you want to use this feature.
The Flummi Language
This compiler prototype uses a "minimum viable language" as an input, as such it isn't really ergonomic to program---but that isn't our goal here! Some of the "unergonomic" things you will need to contend with are that we only have infinite loops with loop controls (BREAK/CONTINUE), our conditionals always need two branches, all variables are scoped globally, etc. The following is a complete EBNF-esque grammar of the Flummi language.
𝑠 := { 𝑠; …; 𝑠 }
| DECLARE 𝑣 : 𝑡
| LET 𝑣 = 𝑒
| EMIT 𝑣
| STOP
| NOOP
| IF 𝑣 THEN 𝑠 ELSE 𝑠
| LOOP 𝑠
| BREAK
| CONTINUE
| FORK 𝑣 = 𝑞
| GATHER 𝑣 = 𝑎 [ BY 𝑣, …, 𝑣 ]
| SYNC [ BY 𝑣, …, 𝑣 ]
𝑣 := <variable>
𝑡 := §<SQL type>§
𝑒 := §<scalar SQL expression>§[<free variables>]
𝑞 := §<non-scalar SQL expression>§[<free variables>]
𝑎 := §<SQL aggregate expression>§[<free variables>]
Embedded SQL
The grammar contains a few §...§[...] bits that we call "black boxes" in most of our work. Syntactically we represent these in two parts, the first §...§ being a bit of SQL code with free variables (free wrt. the code snippet itself) replaced by {𝑖} and the second being a list of the free variables in the order of 𝑖. So the expression §{0} + 1§[v] maps to the SQL expression that increments the variable v, i.e., v + 1. Note that for types we don't expect free variables, since that doesn't make much sense for our purposes.
Data-Driven Concurrency
Up until LOOP, BREAK, CONTINUE, the grammar above probably doesn't need much explaining. After that we get a triplet of statements that you may not be able to make much sense of. TL;DR together, FORK, GATHER, and SYNC represent a data-driven version of the ubiquitous fork-join model for concurrency enabled by our compilation method. The work on this has yet to be published, so you won't find much information on this feature elsewhere. As a brief starter, consider the following:
FORK …allows you to fork the current program state into multiple sibling program states at any given time for each row in the result of a table-valued SQL expression.GATHER … [BY …]allows you to join (bad overlap in DB-lingo here, so we call it gather) sibling program states together by combining data from each of them using a SQL aggregate expression. Further you can group sibling program states by providing an additional grouping key.SYNC [BY …]allows you to enforce synchronicity between sibling program states, again with the option to group sibling program states using a grouping key.
The latter is necessary due to our handling of loops during compilation; the SQL queries we compile to execute loop iterations of all siblings in lock step, which can lead to situations where siblings diverge wrt. their current progress through the program. Without the SYNC statement, such divergence would be unrecoverable, thus making it impossible to join these divergent sibling program states.
Related Publications
- Tim Fischer and Denis Hirn. 2025. BIRNE: Mixed-paradigm Workload Execution in SQL Engines. In Proceedings of the 19th International Symposium on Database Programming Languages (DBPL '25). Association for Computing Machinery, New York, NY, USA, Article 4, 1–11. https://doi.org/10.1145/3735106.3736535
- Tim Fischer, Denis Hirn, and Torsten Grust. 2024. SQL Engines Excel at the Execution of Imperative Programs. Proc. VLDB Endow. 17, 13 (September 2024), 4696–4708. https://doi.org/10.14778/3704965.3704976
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file flummi-0.2.1.tar.gz.
File metadata
- Download URL: flummi-0.2.1.tar.gz
- Upload date:
- Size: 35.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.8 {"installer":{"name":"uv","version":"0.11.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9725434b0a3092c49b329ce97c668e563174a52ec68dda2a645e176e1d97f1bb
|
|
| MD5 |
f00fc4732755e9fcf058f1df1d9e5d70
|
|
| BLAKE2b-256 |
01f768fe9002648f25c06a792393b1822c263d0af95d556d41aeae6ac648b2fd
|
File details
Details for the file flummi-0.2.1-py3-none-any.whl.
File metadata
- Download URL: flummi-0.2.1-py3-none-any.whl
- Upload date:
- Size: 39.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.8 {"installer":{"name":"uv","version":"0.11.8","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1355b245c527f723941bfcfa45aafa3b2afff0b167769caa4335edbb38790458
|
|
| MD5 |
352a5be8ba936f9385e3863663aa4104
|
|
| BLAKE2b-256 |
f7a3e8c9fd9c7b3d6cce505e766ee014d5f54a7a2668624a7763c8154e0a93cf
|