Dynamically build Airflow DAGs from YAML files
dag-factory
dag-factory is a library for dynamically generating Apache Airflow DAGs from YAML configuration files.
Installation
To install dag-factory, run pip install dag-factory. It requires Python 3.6.0+ and Apache Airflow 1.10+.
Usage
After installing dag-factory in your Airflow environment, there are two steps to creating DAGs. First, we need to create a YAML configuration file. For example:
    example_dag1:
      default_args:
        owner: 'example_owner'
        start_date: 2018-01-01  # or '2 days'
        end_date: 2018-01-05
        retries: 1
        retry_delay_sec: 300
      schedule_interval: '0 3 * * *'
      concurrency: 1
      max_active_runs: 1
      dagrun_timeout_sec: 60
      default_view: 'tree'  # or 'graph', 'duration', 'gantt', 'landing_times'
      orientation: 'LR'  # or 'TB', 'RL', 'BT'
      description: 'this is an example dag!'
      on_success_callback_name: print_hello
      on_success_callback_file: /usr/local/airflow/dags/print_hello.py
      on_failure_callback_name: print_hello
      on_failure_callback_file: /usr/local/airflow/dags/print_hello.py
      tasks:
        task_1:
          operator: airflow.operators.bash_operator.BashOperator
          bash_command: 'echo 1'
        task_2:
          operator: airflow.operators.bash_operator.BashOperator
          bash_command: 'echo 2'
          dependencies: [task_1]
        task_3:
          operator: airflow.operators.bash_operator.BashOperator
          bash_command: 'echo 3'
          dependencies: [task_1]
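Then, in the DAGs folder of your Airflow environment, create a Python file like this:

    # This import is unused, but it makes Airflow's DAG discovery parse this file.
    from airflow import DAG
    import dagfactory

    # Load the YAML configuration file.
    dag_factory = dagfactory.DagFactory("/path/to/dags/config_file.yml")

    # Drop previously generated DAGs that are no longer in the config,
    # then register the generated DAGs in this module's namespace.
    dag_factory.clean_dags(globals())
    dag_factory.generate_dags(globals())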
And this DAG will be generated and ready to run in Airflow!
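The configuration above names a callback, print_hello, stored in /usr/local/airflow/dags/print_hello.py, but that file is not shown. Below is a minimal sketch of what it could contain, assuming the callback only prints a greeting. The function body and the use of the run_id context key are illustrative and not part of dag-factory. Airflow calls success and failure callbacks with a single context dictionary.

```python
# Hypothetical contents of /usr/local/airflow/dags/print_hello.py.
# Only the function name comes from the YAML above; the body is an assumption.


def print_hello(context):
    """Callback invoked by Airflow with a context dictionary."""
    # Fall back to a placeholder so the function also works with an empty context.
    run_id = context.get("run_id", "unknown run")
    print(f"hello from {run_id}")


if __name__ == "__main__":
    # Stand-in context for trying the callback outside Airflow.
    print_hello({"run_id": "manual__2018-01-01"})
```

Because the same function is used for both on_success_callback_name and on_failure_callback_name, it should not assume which outcome triggered it.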
Notes
HttpSensor (since 0.10.0)
The package airflow.sensors.http_sensor works with all supported versions of Airflow. In Airflow 2.0+, the new package name can be used in the operator value: airflow.providers.http.sensors.http
The following example shows response_check logic in a Python file:
    task_2:
      operator: airflow.sensors.http_sensor.HttpSensor
      http_conn_id: 'test-http'
      method: 'GET'
      response_check_name: check_sensor
      response_check_file: /path/to/example1/http_conn.py
      dependencies: [task_1]
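The file referenced by response_check_file is not shown here. As a rough sketch, it could define check_sensor as a function that takes the HTTP response and returns True once the sensor's condition is met. The check on the response body is an assumption chosen for illustration, and the SimpleNamespace objects only stand in for a real response.

```python
# Hypothetical contents of /path/to/example1/http_conn.py.
# Only the function name comes from the YAML above; the body is an assumption.
from types import SimpleNamespace


def check_sensor(response):
    """Return True when the response body signals success."""
    return "ok" in response.text


if __name__ == "__main__":
    # Stand-ins for HTTP responses; only the .text attribute is used.
    print(check_sensor(SimpleNamespace(text="status: ok")))     # True
    print(check_sensor(SimpleNamespace(text="status: error")))  # False
```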
The response_check logic can also be provided as a lambda:

    task_2:
      operator: airflow.sensors.http_sensor.HttpSensor
      http_conn_id: 'test-http'
      method: 'GET'
      response_check_lambda: 'lambda response: "ok" in response.text'
      dependencies: [task_1]
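To see what such a string amounts to, the sketch below evaluates a lambda string as a Python expression and calls the result. That the string is turned into a callable this way is an assumption for illustration, not something this README confirms about dag-factory's internals. The SimpleNamespace objects are stand-ins for HTTP responses.

```python
from types import SimpleNamespace

# The lambda as it would be written in the YAML value.
response_check_lambda = 'lambda response: "ok" in response.text'

# Assumption: the string is evaluated as a Python expression to obtain a callable.
response_check = eval(response_check_lambda)

# Stand-ins for HTTP responses; only the .text attribute is used.
print(response_check(SimpleNamespace(text="status: ok")))     # True
print(response_check(SimpleNamespace(text="status: error")))  # False
```

The name used in the lambda body must match its parameter name. A mismatch is not detected when the string is parsed and only raises a NameError when the function is called.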
Benefits
- Construct DAGs without knowing Python
- Construct DAGs without learning Airflow primitives
- Avoid duplicative code
- Everyone loves YAML! ;)
Contributing
Contributions are welcome! Just submit a pull request or open a GitHub issue.