
Apache Airflow integration for dbt

Project description

airflow-dbt

This is a collection of Airflow operators to provide easy integration with dbt.

from airflow import DAG
from airflow_dbt.operators.dbt_operator import DbtSnapshotOperator, DbtRunOperator, DbtTestOperator
from airflow.utils.dates import days_ago

default_args = {
  'dir': '/srv/app/dbt',
  'start_date': days_ago(0)
}

with DAG(dag_id='dbt', default_args=default_args, schedule_interval='@daily') as dag:

  dbt_snapshot = DbtSnapshotOperator(
    task_id='dbt_snapshot',
  )

  dbt_run = DbtRunOperator(
    task_id='dbt_run',
  )

  dbt_test = DbtTestOperator(
    task_id='dbt_test',
    retries=0,  # Failing tests would fail the task, and we don't want Airflow to try again
  )

  dbt_snapshot >> dbt_run >> dbt_test

Installation

Install from PyPI:

pip install airflow-dbt

The operators also need access to the dbt CLI, which should either be on your PATH or can be set with the dbt_bin argument in each operator.
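
For example, if dbt is installed in a virtualenv rather than on the PATH, you can point an operator at the executable via dbt_bin (the path below is hypothetical):

dbt_run = DbtRunOperator(
  task_id='dbt_run',
  dbt_bin='/usr/local/airflow/.venv/bin/dbt',  # hypothetical virtualenv location of the dbt executable
)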

Usage

There are three operators currently implemented:

  • DbtSnapshotOperator
    • Runs dbt snapshot
  • DbtRunOperator
    • Runs dbt run
  • DbtTestOperator
    • Runs dbt test

Each of the above operators accepts the following arguments (an example using them follows this list):

  • profiles_dir
    • If set, passed as the --profiles-dir argument to the dbt command
  • target
    • If set, passed as the --target argument to the dbt command
  • dir
    • The directory to run the dbt command in
  • full_refresh
    • If set to True, passes --full-refresh
  • vars
    • If set, passed as the --vars argument to the dbt command. Should be set as a Python dictionary, as it will be passed to the dbt command as YAML
  • models
    • If set, passed as the --models argument to the dbt command
  • exclude
    • If set, passed as the --exclude argument to the dbt command
  • dbt_bin
    • The dbt CLI. Defaults to dbt, so assumes it's on your PATH
  • verbose
    • If set to True, the operator will log verbosely to the Airflow logs
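
As an illustration, here is a sketch of a DbtRunOperator inside the DAG above that uses these arguments; the paths, target name, variable values and model selectors are hypothetical:

  dbt_run_incremental = DbtRunOperator(
    task_id='dbt_run_incremental',
    profiles_dir='/srv/app/dbt/profiles',  # passed as --profiles-dir (hypothetical path)
    target='prod',                         # passed as --target
    dir='/srv/app/dbt',                    # directory the dbt command runs in
    full_refresh=True,                     # adds --full-refresh
    vars={'run_date': '2021-01-01'},       # dict rendered to YAML and passed as --vars
    models='staging',                      # passed as --models
    exclude='deprecated',                  # passed as --exclude
    dbt_bin='dbt',                         # dbt executable, assumed to be on PATH
    verbose=True,                          # log dbt output to the Airflow logs
  )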

Typically you will want to use the DbtRunOperator, followed by the DbtTestOperator, as shown earlier.

You can also use the hook directly. Typically this is useful when you need to combine the dbt command with another task in the same operator, for example running dbt docs and uploading the docs to somewhere they can be served from.
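
A minimal sketch of that pattern, assuming the hook is DbtCliHook (from airflow_dbt.hooks.dbt_hook) with a run_cli method that takes the dbt subcommand as arguments; the docs upload step is a hypothetical placeholder:

from airflow.operators.python_operator import PythonOperator
from airflow_dbt.hooks.dbt_hook import DbtCliHook

def generate_and_upload_docs():
    # Run `dbt docs generate` via the CLI hook
    DbtCliHook(dir='/srv/app/dbt').run_cli('docs', 'generate')
    # upload_docs('/srv/app/dbt/target')  # hypothetical step: push the generated docs to where they are served from

dbt_docs = PythonOperator(
    task_id='dbt_docs',
    python_callable=generate_and_upload_docs,
)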

Building Locally

To install from the repository, it's recommended to first create a virtual environment:

python3 -m venv .venv

source .venv/bin/activate

Install using pip:

pip install .

Testing

To run tests locally, first create a virtual environment (see the Building Locally section).

Install dependencies:

pip install . pytest

Run the tests:

pytest tests/

Code style

This project uses flake8.

To check your code, first create a virtual environment (see Building Locally section):

pip install flake8
flake8 airflow_dbt/ tests/ setup.py

Package management

If you use dbt's package manager, you should include all dependencies before deploying your dbt project.

For Docker users, packages specified in packages.yml should be included as part of your Docker image by calling dbt deps in your Dockerfile.
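
A minimal Dockerfile sketch of that approach; the base image, unpinned package versions and project path are illustrative assumptions:

FROM python:3.8-slim

# Install dbt and the Airflow integration (illustrative, unpinned versions)
RUN pip install dbt airflow-dbt

# Copy the dbt project and resolve packages.yml dependencies at build time
COPY . /srv/app/dbt
WORKDIR /srv/app/dbt
RUN dbt deps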

License & Contributing

GoCardless ♥ open source. If you do too, come join us.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

airflow_dbt-0.2.0.tar.gz (6.5 kB)

Uploaded: Source

Built Distribution

airflow_dbt-0.2.0-py2.py3-none-any.whl (7.0 kB)

Uploaded: Python 2, Python 3

File details

Details for the file airflow_dbt-0.2.0.tar.gz.

File metadata

  • Download URL: airflow_dbt-0.2.0.tar.gz
  • Upload date:
  • Size: 6.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5

File hashes

Hashes for airflow_dbt-0.2.0.tar.gz

  • SHA256: bb6f33eee9515fe586871bdcd53325fc3ae1926aa3b986aef90ccea6452de874
  • MD5: d0991cd498cb2c9b0cf834f6196d19d7
  • BLAKE2b-256: ec43cf30211f6bef4dc42faf3fb09df70f182181e97103ff3c164073b9dc476d

See more details on using hashes here.

File details

Details for the file airflow_dbt-0.2.0-py2.py3-none-any.whl.

File metadata

  • Download URL: airflow_dbt-0.2.0-py2.py3-none-any.whl
  • Upload date:
  • Size: 7.0 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5

File hashes

Hashes for airflow_dbt-0.2.0-py2.py3-none-any.whl

  • SHA256: 0ea8a9770eed27cd1f2021ced5ed42a92382b613bb51aee781b977a20eb16a53
  • MD5: 1504894631d7b5162b9eaddb7e69e956
  • BLAKE2b-256: f629d4f7b5a50372bada3d394af6206fe38e74d23afc06265e7bf0eff72a48bb

See more details on using hashes here.
