The smallest DuckDB SQL transformations orchestrator
yato — yet another transformation orchestrator
yato is the smallest orchestrator on Earth for SQL data transformations on top of DuckDB. You give it a folder of SQL queries; it infers the dependency DAG and runs the queries in the right order.
Installation
yato works with Python 3.8+.
pip install yato-orchestrator
Get Started
Create a folder named sql and put your SQL files in it. For instance, you can use the two queries provided in the example folder.
from yato import Yato
yato = Yato(
    # The path of the file in which yato will run the SQL queries.
    # If you want to run it in memory, just set it to :memory:
    database_path="tmp.duckdb",
    # This is the folder where the SQL files are located.
    # The names of the files will determine the name of the table created.
    sql_folder="sql/",
    # The name of the DuckDB schema where the tables will be created.
    schema="transform",
)
# Runs yato against the DuckDB database with the queries in order.
yato.run()
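If the example folder is not at hand, the layout that sql_folder expects can be sketched as follows. The two file names and their queries are hypothetical and chosen only for illustration; they are not the queries shipped in the example folder.

```python
from pathlib import Path

# Create the folder that the sql_folder argument points to.
sql_dir = Path("sql")
sql_dir.mkdir(exist_ok=True)

# Hypothetical first query: a small inline dataset with no dependencies.
(sql_dir / "source_orders.sql").write_text(
    "SELECT 1 AS order_id, 10.0 AS amount\n"
    "UNION ALL\n"
    "SELECT 2 AS order_id, 25.5 AS amount\n"
)

# Hypothetical second query: reads from the table named after the first file.
(sql_dir / "orders_total.sql").write_text(
    "SELECT SUM(amount) AS total_amount FROM source_orders\n"
)

created = sorted(p.name for p in sql_dir.glob("*.sql"))
print(created)
```

Since file names determine table names, a run over this folder would be expected to create source_orders first and orders_total second, because the second query selects from the first.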
You can also run yato from the CLI:
yato run --db tmp.duckdb sql/
Works with dlt
yato is designed to pair with dlt: dlt handles data loading and yato handles data transformation.
import dlt
from yato import Yato
yato = Yato(
    database_path="db.duckdb",
    sql_folder="sql/",
    schema="transform",
)
# Restore the database from S3 before running dlt.
yato.restore()
pipeline = dlt.pipeline(
    pipeline_name="get_my_data",
    destination="duckdb",
    dataset_name="production",
    credentials="db.duckdb",
)
# my_source stands in for your own dlt source; it is not defined here.
data = my_source()
load_info = pipeline.run(data)
# Back up the database after a successful dlt run.
yato.backup()
yato.run()
How does it work?
yato relies on the amazing SQLGlot library to parse the SQL queries and build a DAG of their dependencies. It then runs the queries in the right order.
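The ordering idea can be sketched with the standard library alone. This is not yato's implementation: a naive regular expression stands in for SQLGlot's parsing, the queries dictionary is a hypothetical stand-in for a folder of SQL files (file name without extension mapped to query text), and graphlib requires Python 3.9+.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stand-in for a folder of SQL files: table name -> query text.
queries = {
    "revenue_by_country": (
        "SELECT country, SUM(amount) AS revenue "
        "FROM orders_enriched GROUP BY country"
    ),
    "orders_enriched": (
        "SELECT o.*, c.country FROM orders_clean AS o "
        "JOIN customers AS c ON o.customer_id = c.id"
    ),
    "orders_clean": "SELECT * FROM raw.orders WHERE amount > 0",
}

# Naive pattern: the identifier that follows FROM or JOIN.
# A real SQL parser handles CTEs, subqueries, and quoting; this does not.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def dependencies(sql, known_tables):
    """Return the referenced tables that are themselves defined by a query."""
    # Drop any schema prefix so "transform.orders_clean" matches "orders_clean".
    refs = {name.split(".")[-1] for name in TABLE_REF.findall(sql)}
    return refs & known_tables


def execution_order(queries):
    """Order the queries so that each one runs after the tables it reads."""
    known = set(queries)
    graph = {
        name: dependencies(sql, known) - {name}
        for name, sql in queries.items()
    }
    # TopologicalSorter takes {node: predecessors} and raises CycleError on cycles.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(queries)
print(order)  # ['orders_clean', 'orders_enriched', 'revenue_by_country']
```

Tables that no query in the folder defines, such as raw.orders and customers here, are treated as external inputs and impose no ordering.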
Hashes for yato_lib-0.0.1-1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 129014c2a1597bd380c2c091fbdd0b4544b643b4782893f177c938b951edf62f
MD5 | 154ca32722d8459cf5bd4b778f1d8220
BLAKE2b-256 | 09c22c76bca157c3e70d03edbf5ca22cb95752126eb7502969e7e000e620065a