Skip to main content

Fast & Fake Backfill Airflow DAGs Status

Project description

Airflow Fakefill Marker

Due to migrating to Kubernetes-host Airflow and using different backend, we need to find out a way to fill out all the history since its starting date for thousands of dags. To make this process going faster and easier, in the meantime, I didn't find this kind of tool on Github, so I implement this simple tool to help with marking dags as success. Hope it can also help others.

Installations

Method 1

$ pip install fakefill

Method 2

$ pip install git+https://git@github.com/benbenbang/airflow_fastfill.git

Method 3

$ git clone git@github.com:benbenbang/airflow_fastfill.git
$ cd airflow_fastfill
$ pip install .

Usages

$ fakefill

It takes 1 of 2 required argument, and 6 optional arguments. You can also define them in a yaml file and pass to the cli.

  • Options

    • Required [1 / 2]:

      • dag_id [-d][reqired]: can be a real dag id or "all" to fill all the dags
      • config_path [-cp][choose one]: path to the config yaml
    • Optional:

      • start_date [-sd]: starting date, default will be counted from 365 days ago
      • maximum_day [-md]: maximum fill date per dag, rangint: [1, 180]
      • maximum_unit [-mu]: maxium fill unit per dag, rangint: [1, 43200]
      • ignore [-i]: still procceed auto fill even the dag ran recently
      • pause_only [-p]: pass true to fill dags which are pause
      • confirm [-y]: pass true to bypass the prompt if dag_id is all
      • traceback [-v]: pass print our Airflow Database error

Examples

Fill all the dags for the past 30 days without prompt, and only fill if all the dags which have status == pause

$ fakefill -d all -p -md 30 -y

Run fastfill for dag id == dag_a by counting default fakefill days == 365

$ fakefill -d dag_a

Run fastfill with config yaml

$ fakefill -cp config.yml

The yaml file needs to be defined with two dictonary types: dags and settings. For dags section, it needs to be a list, while the settingssection is dict

Sample:

dags:
  - dag_a
  - dag_b
  - dag_c

settings:
  start_date: 2019-01-01
  maximum: "365"
  traceback: false
  confirm: true
  pause_only: true

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fakefill-1.0.0.tar.gz (13.2 kB view hashes)

Uploaded source

Built Distribution

fakefill-1.0.0-py3-none-any.whl (15.0 kB view hashes)

Uploaded py3

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Huawei Huawei PSF Sponsor Microsoft Microsoft PSF Sponsor NVIDIA NVIDIA PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page