Airflow provider for Versatile Data Kit.
Versatile Data Kit Airflow provider
A set of Airflow operators, sensors and a connection hook intended to help schedule Versatile Data Kit jobs using Apache Airflow.
Usage
To install the provider, run:
pip install airflow-provider-vdk
Then you can create a workflow of data jobs (deployed by the VDK Control Service) like this:
from datetime import datetime

from airflow import DAG
from vdk_provider.operators.vdk import VDKOperator

with DAG(
    "airflow_example_vdk",
    schedule_interval=None,  # no schedule; the DAG runs only when triggered
    start_date=datetime(2022, 1, 1),
    catchup=False,
    tags=["example", "vdk"],
) as dag:
    trino_job1 = VDKOperator(
        conn_id="vdk-default",
        job_name="airflow-trino-job1",
        team_name="taurus",
        task_id="trino-job1",
    )

    trino_job2 = VDKOperator(
        conn_id="vdk-default",
        job_name="airflow-trino-job2",
        team_name="taurus",
        task_id="trino-job2",
    )

    transform_job = VDKOperator(
        conn_id="vdk-default",
        job_name="airflow-transform-job",
        team_name="taurus",
        task_id="transform-job",
    )

    # transform_job starts only after both Trino jobs have succeeded
    # (Airflow's default trigger rule).
    [trino_job1, trino_job2] >> transform_job
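The dependency line `[trino_job1, trino_job2] >> transform_job` makes the transform job wait for both Trino jobs. The following is a minimal sketch of that ordering in plain Python, using the standard library's `graphlib` rather than Airflow itself. The `dependencies` mapping and the `execution_waves` helper are hypothetical names introduced only for illustration; they are not part of this provider or of Airflow.

```python
from graphlib import TopologicalSorter

# Map each task id to the set of task ids it waits for, mirroring
# `[trino_job1, trino_job2] >> transform_job` from the DAG above.
dependencies = {
    "trino-job1": set(),
    "trino-job2": set(),
    "transform-job": {"trino-job1", "trino-job2"},
}


def execution_waves(deps):
    """Group tasks into waves; tasks within one wave do not depend on each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Everything whose prerequisites are done is ready to run now.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(dependencies)
print(waves)  # [['trino-job1', 'trino-job2'], ['transform-job']]
```

The two Trino jobs land in the first wave because neither depends on the other, so a scheduler is free to run them in parallel; the transform job forms the second wave.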
Example
Demo
You can watch a demo from one of the community meetings here: https://www.youtube.com/watch?v=c3j1aOALjVU&t=690s
Architecture