Acceldata Airflow Listener Plugin

Overview

The Acceldata Listener plugin integrates with Airflow so that your DAGs are observed automatically in ADOC.

Features

The plugin performs the following actions without requiring additional code in your Airflow DAG, unless you disable instrumentation through environment variables.

  • When the DAG starts:

    • It creates the pipeline if it does not already exist in ADOC.
    • It creates a new pipeline run in ADOC.
  • When a TaskInstance starts:

    • It creates jobs in ADOC for each of the Airflow operators used in the task.
    • It constructs job input nodes based on the upstream tasks.
    • It creates a span and associates it with the jobs.
    • It emits span events with metadata.
  • When a TaskInstance is completed:

    • It emits span events with metadata.
    • It ends the spans with either success or failure.
  • When the DAG is completed:

    • It updates the pipeline run with success or failure in ADOC.
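
The lifecycle above can be pictured as four callbacks. The following is a minimal sketch in plain Python, not the plugin's actual code: the class `LifecycleRecorder` and its method names are hypothetical, and instead of calling ADOC it only records which actions would be triggered at each stage.

```python
from dataclasses import dataclass, field


@dataclass
class LifecycleRecorder:
    """Hypothetical stand-in that logs the actions listed above instead of calling ADOC."""

    known_pipelines: set = field(default_factory=set)
    actions: list = field(default_factory=list)

    def on_dag_start(self, dag_id: str) -> None:
        # Create the pipeline only if it does not exist yet, then always open a new run.
        if dag_id not in self.known_pipelines:
            self.known_pipelines.add(dag_id)
            self.actions.append(f"create_pipeline:{dag_id}")
        self.actions.append(f"create_pipeline_run:{dag_id}")

    def on_task_start(self, task_id: str, upstream: list) -> None:
        # A job is created with input nodes derived from the upstream tasks,
        # then a span is opened and a start event is emitted.
        self.actions.append(f"create_job:{task_id}<-{','.join(upstream) or 'none'}")
        self.actions.append(f"start_span:{task_id}")
        self.actions.append(f"span_event:{task_id}:started")

    def on_task_end(self, task_id: str, success: bool) -> None:
        # Emit a final span event, then close the span with the outcome.
        status = "success" if success else "failure"
        self.actions.append(f"span_event:{task_id}:{status}")
        self.actions.append(f"end_span:{task_id}:{status}")

    def on_dag_end(self, dag_id: str, success: bool) -> None:
        # The pipeline run is updated with the overall outcome.
        status = "success" if success else "failure"
        self.actions.append(f"update_pipeline_run:{dag_id}:{status}")
```

Calling `on_dag_start` twice for the same DAG ID records one pipeline creation but two pipeline runs, which mirrors the "create if it does not already exist" behaviour described above.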

Prerequisites

Ensure the following applications are installed on your system:

API keys are essential for authentication when making calls to ADOC. You can generate API keys in the ADOC UI's Admin Central by visiting the API Keys section.

Configuration

Plugin Environment Variables

The adoc_airflow_plugin uses the acceldata-sdk to push data to the ADOC backend.

Mandatory Environment Variables: The ADOC client requires the following environment variables:

  • TORCH_CATALOG_URL: The URL of your ADOC Server instance.
  • TORCH_ACCESS_KEY: The API access key generated from the ADOC UI.
  • TORCH_SECRET_KEY: The API secret key generated from the ADOC UI.
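
Because a missing variable typically only surfaces when the first call to ADOC fails, it can help to check the environment up front. The sketch below is one way to do that; the helper `missing_adoc_vars` is hypothetical and not part of the plugin or the acceldata-sdk. Only the three variable names come from the list above.

```python
import os

# Variable names are taken from the list above.
REQUIRED_VARS = ("TORCH_CATALOG_URL", "TORCH_ACCESS_KEY", "TORCH_SECRET_KEY")


def missing_adoc_vars(env=None):
    """Return the mandatory variables that are unset or blank (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_adoc_vars()
    if missing:
        print("Missing ADOC settings:", ", ".join(missing))
    else:
        print("All mandatory ADOC settings are present.")
```

Running this in the same environment as the Airflow scheduler and workers shows whether the processes that load the plugin can see the credentials.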

Optional Environment Variables: By default, all DAGs are observed. However, the following environment variables can be set to modify this behavior.

Note: The variables for ignoring DAGs and the variables for observing DAGs are mutually exclusive. Set one group or the other, not both.

  • If the following environment variables match specific DAG IDs, those DAGs are excluded from observation, while all other DAGs are still observed:

    • DAGIDS_TO_IGNORE: Comma-separated list of DAG IDs to ignore.
    • DAGIDS_REGEX_TO_IGNORE: Regular expression pattern for DAG IDs to ignore.
  • If the following environment variables match specific DAG IDs, only those DAGs will be observed, and all others will be ignored:

    • DAGIDS_TO_OBSERVE: Comma-separated list of DAG IDs to observe.
    • DAGIDS_REGEX_TO_OBSERVE: Regular expression pattern for DAG IDs to observe.
  • The following environment variables can be used to configure timeout settings for communication with the ADOC server:

    • TORCH_CONNECTION_TIMEOUT_MS: Maximum time (in milliseconds) to wait while establishing a connection to the ADOC server. Default: 5000 ms.
    • TORCH_READ_TIMEOUT_MS: Maximum time (in milliseconds) to wait for a response from the ADOC server after a successful connection. Default: 15000 ms.
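
The optional variables above can be read as a small decision rule plus two timeout values. The sketch below illustrates that reading and is not the plugin's implementation: `should_observe` and `timeouts_seconds` are hypothetical helpers, and three behaviours are assumptions made for the example (whitespace around list entries is stripped, regular expressions must match the whole DAG ID, and setting both groups raises an error).

```python
import re


def _split_ids(value):
    # Comma-separated list; stripping surrounding whitespace is an assumption.
    return {item.strip() for item in (value or "").split(",") if item.strip()}


def should_observe(dag_id, env):
    """Hypothetical decision helper mirroring the ignore/observe rules described above."""
    ignore_ids = _split_ids(env.get("DAGIDS_TO_IGNORE"))
    ignore_re = env.get("DAGIDS_REGEX_TO_IGNORE")
    observe_ids = _split_ids(env.get("DAGIDS_TO_OBSERVE"))
    observe_re = env.get("DAGIDS_REGEX_TO_OBSERVE")

    has_ignore = bool(ignore_ids or ignore_re)
    has_observe = bool(observe_ids or observe_re)
    if has_ignore and has_observe:
        # The groups are mutually exclusive; rejecting the combination is this sketch's choice.
        raise ValueError("Set either the ignore variables or the observe variables, not both")

    def matches(ids, pattern):
        # Full-match semantics for the regular expression are an assumption.
        return dag_id in ids or bool(pattern and re.fullmatch(pattern, dag_id))

    if has_ignore:
        return not matches(ignore_ids, ignore_re)
    if has_observe:
        return matches(observe_ids, observe_re)
    return True  # Default: every DAG is observed.


def timeouts_seconds(env):
    """Read both timeouts, falling back to the documented defaults, and convert to seconds."""
    connect_ms = int(env.get("TORCH_CONNECTION_TIMEOUT_MS", 5000))
    read_ms = int(env.get("TORCH_READ_TIMEOUT_MS", 15000))
    return connect_ms / 1000, read_ms / 1000
```

For example, with `DAGIDS_REGEX_TO_OBSERVE` set to `etl_.*`, this rule observes `etl_daily` and skips every DAG whose ID does not match the pattern.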
