Harmonizer
Harmonizer is a library we have developed at Dune to translate Dune queries from PostgreSQL and Spark SQL to DuneSQL. We currently use this library in our migration service in the app.
A query is translated in two steps:
- We use SQLGlot to transpile the query. This is an excellent tool which parses a SQL query into an Abstract Syntax Tree (AST), and then translates it to a different dialect. We use it to translate from Spark SQL to DuneSQL, and from PostgreSQL to DuneSQL.
- We pass the query through custom rules to make additional changes to the query. Examples of such rules are:
  - mapping known changes in table names from the legacy Postgres datasets to the corresponding table names in DuneSQL
  - translating string literals '0x...' to 0x... in DuneSQL, since we support native hex literals.
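The two example rules above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Harmonizer's implementation: it uses regular expressions on the query text rather than a parsed query, and the names `TABLE_NAME_MAP`, `rename_tables`, `unquote_hex_literals` and `apply_custom_rules`, as well as the table names in the mapping, are made up for this example.

```python
import re
from typing import Mapping

# Hypothetical mapping from a legacy Postgres table name to a DuneSQL name.
# The entry is invented for illustration; it is not a real Dune table.
TABLE_NAME_MAP = {
    "legacy_schema.old_table": "new_schema.new_table",
}

# A quoted string whose entire content is a hex literal, e.g. '0xdeadbeef'.
HEX_STRING_LITERAL = re.compile(r"'(0x[0-9a-fA-F]+)'")


def rename_tables(query: str, mapping: Mapping[str, str]) -> str:
    """Replace each known legacy table name with its new name."""
    for old, new in mapping.items():
        # Only match the full dotted name, not a fragment of a longer identifier.
        pattern = re.compile(rf"(?<![\w.]){re.escape(old)}(?![\w.])")
        query = pattern.sub(lambda _match: new, query)
    return query


def unquote_hex_literals(query: str) -> str:
    """Turn string literals like '0xabc' into native hex literals 0xabc."""
    return HEX_STRING_LITERAL.sub(r"\1", query)


def apply_custom_rules(query: str, mapping: Mapping[str, str] = TABLE_NAME_MAP) -> str:
    """Apply both illustrative rules in sequence."""
    return unquote_hex_literals(rename_tables(query, mapping))
```

A text-based rewrite like this cannot tell a table name from the same text inside a comment or string, which is one reason to apply such rules to a parsed query instead.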
Getting started
Install with
pip install dune-harmonizer
Now import the translate_ functions in your code:
from dune.harmonizer import translate_spark, translate_postgres
with function signatures
def translate_spark(query: str) -> str:
    ...
def translate_postgres(query: str, dataset: str) -> str:
    ...
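When migrating many queries at once, it can help to wrap the translate function so that one failing query does not stop the rest. The helper below is a hypothetical sketch and not part of the library. `translate_many` is an invented name, and the helper assumes that a failed translation surfaces as a Python exception, which the description above does not state.

```python
from typing import Callable, Dict, Tuple


def translate_many(
    queries: Dict[str, str],
    translate: Callable[[str], str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Translate each named query, separating successes from failures.

    `translate` is any function taking a SQL string and returning a SQL string.
    """
    translated: Dict[str, str] = {}
    failed: Dict[str, str] = {}
    for name, sql in queries.items():
        try:
            translated[name] = translate(sql)
        except Exception as exc:  # assumption: failures are raised as exceptions
            failed[name] = f"{type(exc).__name__}: {exc}"
    return translated, failed
```

With the package installed, `translate_spark` can be passed directly as `translate`. For `translate_postgres`, the `dataset` argument can be bound first with `functools.partial(translate_postgres, dataset=...)`, using whichever dataset name applies to your queries.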
Contributing
Contributions are very welcome!
Please open an issue, and we will get back to you as soon as we can.
Development
Install with
poetry install
If the Ruff linter complains, running the following commands and committing the changes should suffice:
poetry run ruff . --fix
poetry run black .
Run tests with
poetry run pytest
We test on examples in the test_cases directory.
To force an update of the expected outputs, run the update_expected_outputs script as shown below:
poetry run python tests/update_expected_outputs.py