Airflow Metrics to BigQuery

Sends Airflow StatsD metrics to Google BigQuery.


Installation

pip install airflow-metrics-gbq

Usage

  1. Enable StatsD metrics in airflow.cfg:
[metrics]
statsd_on = True
statsd_host = localhost
statsd_port = 8125
statsd_prefix = airflow
  2. Restart the webserver and the scheduler:
systemctl restart airflow-webserver.service
systemctl restart airflow-scheduler.service
  3. Check that Airflow is sending out metrics:
nc -l -u localhost 8125
  4. Install this package.
  5. Create the required tables (counters, gauges and timers); an example is shared here.
  6. Create materialized views that refresh when the base table changes, as described here.
  7. Create a simple Python script, monitor.py, to provide the configuration:
from airflow_metrics_gbq.metrics import AirflowMonitor

if __name__ == "__main__":
    monitor = AirflowMonitor(
        host="localhost",  # StatsD host (statsd_host in airflow.cfg)
        port=8125,  # StatsD port (statsd_port in airflow.cfg)
        gcp_credentials="path/to/service/account.json",  # service account key file
        dataset_id="monitoring",  # dataset where the monitoring tables are
        counts_table="counts",  # counters table
        last_table="last",  # gauges table
        timers_table="timers",  # timers table
    )
    monitor.run()
  8. Run the program, ideally in the background, to start sending metrics to BigQuery:
python monitor.py &
  9. The logs can be viewed in the GCP console, under the airflow_monitoring app_name in Google Cloud Logging.
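
For orientation, the three tables correspond to the three StatsD metric types that Airflow emits: counters (`|c`), gauges (`|g`) and timers (`|ms`). The sketch below is not this package's implementation; it is a minimal, standalone illustration of how a StatsD datagram could be parsed and classified, and it can stand in for the `nc` check above. The names `Metric`, `parse_statsd_line` and `listen` are hypothetical and exist only in this example.

```python
import socket
from typing import NamedTuple, Optional


class Metric(NamedTuple):
    name: str
    value: float
    kind: str  # "counter", "gauge" or "timer"


# StatsD type codes mapped to the kind of table they would land in.
_KINDS = {"c": "counter", "g": "gauge", "ms": "timer"}


def parse_statsd_line(line: str) -> Optional[Metric]:
    """Parse one StatsD line such as 'airflow.dagbag_size:5|g'.

    Returns None for lines that are malformed or use an unhandled type.
    """
    try:
        name, rest = line.strip().rsplit(":", 1)
        parts = rest.split("|")  # value, type code, optional sample rate
        return Metric(name, float(parts[0]), _KINDS[parts[1]])
    except (ValueError, IndexError, KeyError):
        return None


def listen(host: str = "localhost", port: int = 8125) -> None:
    """Print every recognised metric received on the StatsD UDP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, _addr = sock.recvfrom(65535)
            for line in data.decode("utf-8", errors="replace").splitlines():
                metric = parse_statsd_line(line)
                if metric is not None:
                    print(metric)
```

Calling `listen()` binds to the same UDP port that monitor.py is configured with, so it is best used for a quick check before starting the monitor rather than alongside it.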

Future releases

  • Add a buffer (pyzmq or mp queue)
  • Run sending metrics to GBQ in another process
  • Add proper typing and mypy support and checks
  • Provide more configurable options
  • Provide better documentation
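
As a rough idea of what the first two items might involve, the sketch below shows a batching consumer that reads metrics from a queue and hands them to a sender in batches. It is an assumption about one possible design, not the planned implementation: `sender`, `STOP` and the `flush` callback (which stands in for the BigQuery write) are hypothetical names.

```python
import queue

STOP = None  # sentinel: put this on the queue to shut the sender down


def sender(buffer, flush, batch_size=100, timeout=1.0):
    """Drain `buffer` and call `flush(batch)` with lists of metrics.

    `buffer` can be a queue.Queue or a multiprocessing.Queue; both raise
    queue.Empty when get() times out. `flush` is a hypothetical callback.
    """
    batch = []
    while True:
        try:
            item = buffer.get(timeout=timeout)
        except queue.Empty:
            # Quiet period: send whatever has accumulated so far.
            if batch:
                flush(batch)
                batch = []
            continue
        if item is STOP:
            # Send the remainder, then exit.
            if batch:
                flush(batch)
            return
        batch.append(item)
        if len(batch) >= batch_size:
            flush(batch)
            batch = []
```

To run the sender in another process, `sender` could be passed as the target of a `multiprocessing.Process` together with a `multiprocessing.Queue`, while the StatsD listener puts parsed metrics on that queue. In that setup `flush` has to be a picklable, module-level function.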
