Airflow metrics to Google BigQuery

Sends Airflow metrics to BigQuery.


Installation

pip install airflow-metrics-gbq

Usage

  1. Activate statsd metrics in airflow.cfg:
[metrics]
statsd_on = True
statsd_host = localhost
statsd_port = 8125
statsd_prefix = airflow
  2. Restart the webserver and the scheduler:
systemctl restart airflow-webserver.service
systemctl restart airflow-scheduler.service
  3. Check that Airflow is sending out metrics:
nc -l -u localhost 8125
  4. Install this package (see Installation above)
  5. Create the required tables (counters, gauges and timers); an example is shared here
  6. Create materialized views which refresh when the base table changes, as described here
  7. Create a simple Python script, monitor.py, to provide the configuration:
from airflow_metrics_gbq.metrics import AirflowMonitor

if __name__ == '__main__':
    monitor = AirflowMonitor(
        host="localhost",  # statsd_host from airflow.cfg
        port=8125,  # statsd_port from airflow.cfg
        gcp_credentials="path/to/service/account.json",
        dataset_id="monitoring",  # dataset that holds the monitoring tables
        counts_table="counts",  # counters table
        last_table="last",  # gauges table
        timers_table="timers",  # timers table
    )
    monitor.run()
  8. Run the program, ideally in the background, to start sending metrics to BigQuery:
python monitor.py &
  9. The logs can be viewed in the GCP console under the airflow_monitoring app_name in Google Cloud Logging.
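
If nc is not available for the metrics check, a small Python listener can do the same job and also show which of the three tables (counters, gauges, timers) each metric would belong to. The sketch below is not taken from this package; parse_statsd_line, Metric and listen are hypothetical names, and it only assumes the standard StatsD line format name:value|type with the types c, g and ms.

```python
import socket
from typing import NamedTuple, Optional


class Metric(NamedTuple):
    name: str
    value: float
    kind: str  # "counter", "gauge" or "timer"


# StatsD type codes mapped to the three metric kinds used by the tables.
_KINDS = {"c": "counter", "g": "gauge", "ms": "timer"}


def parse_statsd_line(line: str) -> Optional[Metric]:
    """Parse one StatsD line such as 'airflow.dagbag_size:12|g'.

    Returns None for lines that do not follow name:value|type.
    An optional sample rate suffix ('|@0.5') is ignored.
    """
    try:
        name, rest = line.strip().split(":", 1)
        fields = rest.split("|")
        value = float(fields[0])
        kind = _KINDS[fields[1]]
    except (ValueError, KeyError, IndexError):
        return None
    return Metric(name, value, kind)


def listen(host: str = "localhost", port: int = 8125, max_packets: int = 5) -> None:
    """Print the metrics found in the next max_packets UDP packets.

    host and port should match statsd_host and statsd_port in airflow.cfg.
    Binding fails if another process already listens on the port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        for _ in range(max_packets):
            data, _addr = sock.recvfrom(65535)
            for line in data.decode("utf-8", errors="replace").splitlines():
                print(parse_statsd_line(line) or f"unparsed: {line!r}")
```

Calling listen() blocks until five packets have arrived, so it is best run before monitor.py is started, since both would bind the same UDP port.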

Future releases

  • Increase test coverage (unit and integration tests)
  • Add proper typing and mypy support and checks
  • Provide more configurable options
  • Provide better documentation

Download files

Source Distribution

airflow_metrics_gbq-0.1.0.tar.gz (7.4 kB)

Uploaded Source

Built Distribution

airflow_metrics_gbq-0.1.0-py3-none-any.whl (9.4 kB)

Uploaded Python 3

File details

Details for the file airflow_metrics_gbq-0.1.0.tar.gz.

File metadata

  • Download URL: airflow_metrics_gbq-0.1.0.tar.gz
  • Size: 7.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.2 CPython/3.11.2 Darwin/21.6.0

File hashes

Hashes for airflow_metrics_gbq-0.1.0.tar.gz
Algorithm Hash digest
SHA256 c28afee21ee86eb86bec9437b6dbf7ccba4f92e43cdeafb5813acf16538d30ad
MD5 33b0e09a1bbe5769d29f6986f3474b61
BLAKE2b-256 8487f54eb4a13169963eb49c829c4a65e5fc0b875ef84f471c2c9df620b2f9a5
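
A downloaded file can be checked against the SHA256 digest listed above with the Python standard library. This is a minimal sketch; sha256_of and matches_published_digest are illustrative helper names, and the local path you pass in is assumed to be wherever the file was saved.

```python
import hashlib
from pathlib import Path

# SHA256 published above for airflow_metrics_gbq-0.1.0.tar.gz
EXPECTED_SDIST_SHA256 = "c28afee21ee86eb86bec9437b6dbf7ccba4f92e43cdeafb5813acf16538d30ad"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 with a published hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_digest("airflow_metrics_gbq-0.1.0.tar.gz", EXPECTED_SDIST_SHA256) should return True for an unmodified copy of the source distribution.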


File details

Details for the file airflow_metrics_gbq-0.1.0-py3-none-any.whl.

File hashes

Hashes for airflow_metrics_gbq-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e811375b8306eb26a8f17383e53cf45f6b56ab82009ef979daa321246df5aabb
MD5 acd032dfee7ed449e3132fb847fb79a4
BLAKE2b-256 3a5be466183cd28a188672d1fd118c04d5b246fa92881825cd7fbb5e23db914c
