
Dynamically generates and validates a Python Airflow DAG file from a Jinja2 template and a YAML configuration file, to encourage code reusability.


What is AirflowDAGGenerator?

AirflowDAGGenerator dynamically generates a Python Airflow DAG file from a given Jinja2 template and YAML configuration, to encourage reusable code. It also validates the generated DAG automatically by loading it with Airflow's DagBag, which catches problems such as cyclic dependencies between tasks, invalid tasks, invalid arguments, and typos. This helps ensure the generated DAG is safe to deploy to Airflow.
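
To make the cyclic-dependency check concrete, here is a minimal sketch that detects a cycle in a task graph using only the standard library. It illustrates the idea only: the package delegates validation to Airflow's DagBag, and the find_cycle helper and the task names below are hypothetical, not part of airflowdaggenerator or Airflow.

```python
from typing import Dict, List, Optional


def find_cycle(downstream: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of task ids, or None if the graph is acyclic.

    `downstream` maps each task id to the task ids that run after it.
    Hypothetical helper for illustration; not part of airflowdaggenerator.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(task: str) -> Optional[List[str]]:
        state[task] = GREY
        path.append(task)
        for nxt in downstream.get(task, []):
            if state.get(nxt, WHITE) == GREY:
                # Back edge: nxt is already on the current path, so this is a cycle.
                return path[path.index(nxt):] + [nxt]
            if state.get(nxt, WHITE) == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[task] = BLACK
        return None

    for task in list(downstream):
        if state.get(task, WHITE) == WHITE:
            found = visit(task)
            if found:
                return found
    return None


# Hypothetical task graphs: the second one loops back from "load" to "extract".
valid = {"extract": ["transform"], "transform": ["load"], "load": []}
broken = {"extract": ["transform"], "transform": ["load"], "load": ["extract"]}

print(find_cycle(valid))
print(find_cycle(broken))
```

A DAG whose tasks form such a loop can never be scheduled, which is why this kind of error is worth catching before deployment.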

Why is it useful?

Most data processing DAG pipelines are identical apart from parameters such as source, target, and schedule interval. A dynamic DAG generator built on a templating language is therefore a great help when you manage a large number of pipelines at enterprise level. A standardized template also promotes code reusability and consistent DAGs, and it reduces maintenance and testing effort.

How is it Implemented?

It uses Jinja2, the de facto templating language used in Airflow itself, together with a standard YAML configuration file that supplies the parameters specific to a use case when generating the DAG.
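
The template-plus-parameters idea can be sketched with the standard library alone. Note the substitutions: string.Template stands in for Jinja2, and a plain dict stands in for the parsed YAML file, so the snippet runs without extra dependencies. The template text, the parameter names, and the render_dag helper are hypothetical and do not reflect the package's actual template or configuration schema.

```python
import ast
from string import Template

# Stand-in for a Jinja2 template file (Jinja2 would use {{ dag_id }} syntax).
DAG_TEMPLATE = Template('''\
from airflow import DAG

dag = DAG(
    dag_id="$dag_id",
    schedule_interval="$schedule_interval",
)
''')

# Stand-in for values loaded from a YAML configuration file.
params = {"dag_id": "sales_daily_load", "schedule_interval": "@daily"}


def render_dag(template: Template, values: dict) -> str:
    """Fill the template and confirm the result is syntactically valid Python."""
    source = template.substitute(values)  # raises KeyError on a missing parameter
    ast.parse(source)  # raises SyntaxError if the rendered text is malformed
    return source


print(render_dag(DAG_TEMPLATE, params))
```

The ast.parse call only checks syntax without importing anything; it is a much weaker check than loading the file through Airflow's DagBag.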

Requirements

Python 3.6 or later

Note: Tested on Python 3.6, 3.7, and 3.8 environments; see tox.ini for details.

How to use this Package?

  1. First install the package using:

pip install airflowdaggenerator
  2. Airflow Dag Generator should now be available as a command-line tool. To verify, run:

airflowdaggenerator -h
  3. Airflow Dag Generator can also be run as a module:

python -m airflowdaggenerator -h

Sample Usage:

If you have installed the package, run:

airflowdaggenerator \
    -config_yml_path path/to/config_yml_file \
    -config_yml_file_name config_yml_file \
    -template_path path/to/jinja2_template_file \
    -template_file_name jinja2_template_file \
    -dag_path path/to/generated_output_dag_py_file \
    -dag_file_name generated_output_dag_py_file

Or, equivalently:

python -m airflowdaggenerator \
          -config_yml_path path/to/config_yml_file \
          -config_yml_file_name config_yml_file \
          -template_path path/to/jinja2_template_file \
          -template_file_name jinja2_template_file \
          -dag_path path/to/generated_output_dag_py_file \
          -dag_file_name generated_output_dag_py_file

If you have cloned the project source code, a sample Jinja2 template and YAML configuration file are available under the tests/data folder. To try them, open a terminal in the project root directory and run the following command:

python -m airflowdaggenerator \
          -config_yml_path ./tests/data \
          -config_yml_file_name dag_properties.yml \
          -template_path ./tests/data \
          -template_file_name sample_dag_template.py.j2 \
          -dag_path ./tests/data/output \
          -dag_file_name test_dag.py

test_dag.py is then created in the ./tests/data/output folder.
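
When DAGs are generated from a script or a CI job, the same command can be assembled in Python. The sketch below only builds the argument list from the flags shown above; the build_generator_command helper is hypothetical, and actually running the command assumes the package is installed in the current interpreter's environment.

```python
import sys
from typing import List


def build_generator_command(
    config_dir: str,
    config_file: str,
    template_dir: str,
    template_file: str,
    dag_dir: str,
    dag_file: str,
) -> List[str]:
    """Build the argument list for `python -m airflowdaggenerator`.

    Hypothetical helper; it only uses the flags documented in the sample usage.
    """
    return [
        sys.executable, "-m", "airflowdaggenerator",
        "-config_yml_path", config_dir,
        "-config_yml_file_name", config_file,
        "-template_path", template_dir,
        "-template_file_name", template_file,
        "-dag_path", dag_dir,
        "-dag_file_name", dag_file,
    ]


cmd = build_generator_command(
    "./tests/data", "dag_properties.yml",
    "./tests/data", "sample_dag_template.py.j2",
    "./tests/data/output", "test_dag.py",
)
print(" ".join(cmd[1:]))

# To actually run the generator (requires the package to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.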

Troubleshooting

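If you get an error such as (sqlite3.OperationalError)… while generating the DAG with this package, initialize the Airflow metadata database by running the following command:

airflow initdb

Note: airflow initdb is the Airflow 1.x command; in Airflow 2.x it was renamed to airflow db init.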


