Distributed dbt runs on Apache Airflow

dmp-af: distributed dbt runs on Airflow

Overview

dmp-af runs your dbt models in parallel on Airflow. Each model becomes an independent task while preserving dependencies across domains.

Built for scale. dmp-af is designed for large dbt projects (1000+ models) and data mesh architectures, and it works with projects of any size.

Why dmp-af?

  1. Domain-driven architecture: models are split by domain into separate DAGs that run in parallel, which suits a data mesh
  2. dbt-first design: all configuration lives in dbt model configs, so analytics teams stay in dbt and need no Airflow knowledge
  3. Flexible scheduling: multiple schedules per model (@hourly, @daily, @weekly, @monthly, and more)
  4. Enterprise features: multiple dbt targets, configurable test strategies, built-in maintenance, Kubernetes support

Installation

To install dmp-af, run: pip install dmp-af

To contribute, we recommend using uv to install the package dependencies. Running uv sync --all-packages --all-groups --all-extras installs all of them.

dmp-af by Example

All tutorials and examples are located in the examples folder.

To get basic Airflow DAGs for your dbt project, you need to put the following code into your dags folder:

# LABELS: dag, airflow (it's required for airflow dag-processor)
from dmp_af.dags import compile_dmp_af_dags
from dmp_af.conf import Config, DbtDefaultTargetsConfig, DbtProjectConfig

# specify here all settings for your dbt project
config = Config(
    dbt_project=DbtProjectConfig(
        dbt_project_name='my_dbt_project',
        dbt_project_path='/path/to/my_dbt_project',
        dbt_models_path='/path/to/my_dbt_project/models',
        dbt_profiles_path='/path/to/my_dbt_project',
        dbt_target_path='/path/to/my_dbt_project/target',
        dbt_log_path='/path/to/my_dbt_project/logs',
        dbt_schema='my_dbt_schema',
    ),
    dbt_default_targets=DbtDefaultTargetsConfig(default_target='dev'),
    dry_run=False,  # set to True if you want to turn on dry-run mode
)

dags = compile_dmp_af_dags(
    manifest_path='/path/to/my_dbt_project/target/manifest.json',
    config=config,
)
# expose every generated DAG as a module-level variable so Airflow can discover it
for dag_name, dag in dags.items():
    globals()[dag_name] = dag
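The DAG file above calls compile_dmp_af_dags at module level, so it runs every time Airflow parses the file and needs manifest.json to exist already. The sketch below is one possible guard, not part of dmp-af: require_manifest is a hypothetical helper name, and how dmp-af itself reacts to a missing manifest is not covered here. It only checks the path and points to dbt parse, which writes manifest.json into the target directory.

```python
from pathlib import Path


def require_manifest(manifest_path: str) -> str:
    """Return manifest_path unchanged if the file exists, otherwise fail with a hint.

    Hypothetical helper for illustration; it is not provided by dmp-af.
    """
    path = Path(manifest_path)
    if not path.is_file():
        raise FileNotFoundError(
            f"dbt manifest not found at {path}. "
            "Run `dbt parse` (or `dbt compile`) in the dbt project first."
        )
    return str(path)
```

In the DAG file this could wrap the argument, for example manifest_path=require_manifest('/path/to/my_dbt_project/target/manifest.json'), so that a missing manifest surfaces as a clear import error in the Airflow UI.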

In dbt_project.yml you need to set up default targets for all nodes in your project (see example):

sql_cluster: "dev"
daily_sql_cluster: "dev"
py_cluster: "dev"
bf_cluster: "dev"

This will create Airflow DAGs for your dbt project.
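A quick way to catch a forgotten default target is to check the four keys shown above before deploying. The sketch below works on an already-parsed dictionary of settings, because where exactly these keys sit inside dbt_project.yml depends on your project layout (see the linked example). The helper name and the assumption that these four keys are the complete required set are illustrative, not taken from dmp-af.

```python
# Keys listed in the dbt_project.yml snippet above.
# ASSUMPTION: these four are the full set of default targets to check.
REQUIRED_TARGET_KEYS = ("sql_cluster", "daily_sql_cluster", "py_cluster", "bf_cluster")


def missing_default_targets(settings: dict) -> list[str]:
    """Return the default-target keys that are absent or empty in `settings`."""
    return [key for key in REQUIRED_TARGET_KEYS if not settings.get(key)]


# Example: only one target configured, three still missing.
print(missing_default_targets({"sql_cluster": "dev"}))
```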

See the documentation for more details.

Key Features

Auto-generated DAGs

  • Automatically creates Airflow DAGs from your dbt project
  • Organizes by domain and schedule
  • Handles dependencies across domains
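To make "organizes by domain" and "dependencies across domains" concrete, here is a minimal sketch that reads a dbt manifest-like dictionary and lists the model-to-model edges that cross a domain boundary. It is not dmp-af's implementation. It relies on the nodes, fqn, resource_type, and depends_on fields of dbt's manifest.json, and it assumes that a model's domain is the first folder under models/ (the second element of fqn). That rule, the function names, and the sample data are illustrative assumptions.

```python
from collections import defaultdict


def domain_of(node: dict) -> str:
    """ASSUMPTION: fqn looks like [project, domain_folder, ..., model_name]."""
    fqn = node["fqn"]
    return fqn[1] if len(fqn) > 2 else "default"


def cross_domain_edges(manifest: dict) -> dict:
    """Map (upstream_domain, downstream_domain) to the model pairs that link them."""
    models = {
        uid: node
        for uid, node in manifest["nodes"].items()
        if node.get("resource_type") == "model"
    }
    edges = defaultdict(list)
    for uid, node in models.items():
        for parent in node.get("depends_on", {}).get("nodes", []):
            if parent not in models:
                continue  # skip sources, seeds, and other non-model parents
            upstream, downstream = domain_of(models[parent]), domain_of(node)
            if upstream != downstream:
                edges[(upstream, downstream)].append((parent, uid))
    return dict(edges)


# Made-up sample: two models in "sales", one in "finance".
SAMPLE_MANIFEST = {
    "nodes": {
        "model.shop.orders": {
            "resource_type": "model",
            "fqn": ["shop", "sales", "orders"],
            "depends_on": {"nodes": []},
        },
        "model.shop.daily_orders": {
            "resource_type": "model",
            "fqn": ["shop", "sales", "daily_orders"],
            "depends_on": {"nodes": ["model.shop.orders"]},
        },
        "model.shop.revenue": {
            "resource_type": "model",
            "fqn": ["shop", "finance", "revenue"],
            "depends_on": {"nodes": ["model.shop.orders"]},
        },
    }
}

print(cross_domain_edges(SAMPLE_MANIFEST))
```

In this sample only the edge from orders (sales) to revenue (finance) crosses domains. The dependency inside sales stays within a single DAG.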

Idempotent runs

  • Each model is a separate Airflow task
  • Date intervals passed to every run
  • Reliable backfills and reruns
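The idea behind interval-driven idempotency can be sketched in a few lines: a backfill range is split into fixed windows, and each run touches only its own window, so rerunning a window produces the same result. The code below is a generic illustration and not dmp-af's internals. The function names and the SQL filter shape are assumptions.

```python
from datetime import datetime, timedelta


def daily_intervals(start: datetime, end: datetime):
    """Yield consecutive half-open [interval_start, interval_end) day windows."""
    one_day = timedelta(days=1)
    cursor = start
    while cursor < end:
        yield cursor, min(cursor + one_day, end)
        cursor += one_day


def interval_filter(column: str, interval_start: datetime, interval_end: datetime) -> str:
    """Build a WHERE fragment that limits a run to its own window."""
    return (
        f"{column} >= '{interval_start.isoformat()}' "
        f"AND {column} < '{interval_end.isoformat()}'"
    )


# A three-day backfill becomes three independent, repeatable runs.
for window_start, window_end in daily_intervals(datetime(2024, 1, 1), datetime(2024, 1, 4)):
    print(interval_filter("created_at", window_start, window_end))
```

Half-open windows are used so that adjacent runs never overlap and never leave a gap at the boundary.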

Team-friendly

  • Analytics teams stay in dbt
  • No Airflow DAG writing required
  • Infrastructure handled automatically

Requirements

dmp-af is tested with:

Airflow version    Python versions    dbt-core versions
2.6.3              ≥3.10,<3.12        ≥1.7,≤1.10
2.7.3              ≥3.10,<3.12        ≥1.7,≤1.10
2.8.4              ≥3.10,<3.12        ≥1.7,≤1.10
2.9.3              ≥3.10,<3.13        ≥1.7,≤1.10
2.10.5             ≥3.10,<3.13        ≥1.7,≤1.10
2.11.0             ≥3.10,<3.13        ≥1.7,≤1.10
3.0.6              ≥3.10,<3.13        ≥1.7,≤1.10
3.1.3              ≥3.10,<3.14        ≥1.7,≤1.10
3.2.2              ≥3.10,<3.14        ≥1.7,≤1.10
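If you want to check an environment against this matrix in CI, a small lookup is enough. The sketch below hard-codes the Python ranges from the table above. The helper name is hypothetical, and an Airflow version missing from the table is reported as untested, which is not the same as unsupported.

```python
import sys

# Airflow version -> (inclusive lower bound, exclusive upper bound) for Python,
# copied from the "tested with" table.
TESTED_PYTHON = {
    "2.6.3": ((3, 10), (3, 12)),
    "2.7.3": ((3, 10), (3, 12)),
    "2.8.4": ((3, 10), (3, 12)),
    "2.9.3": ((3, 10), (3, 13)),
    "2.10.5": ((3, 10), (3, 13)),
    "2.11.0": ((3, 10), (3, 13)),
    "3.0.6": ((3, 10), (3, 13)),
    "3.1.3": ((3, 10), (3, 14)),
    "3.2.2": ((3, 10), (3, 14)),
}


def python_is_tested(airflow_version: str, python_version=None) -> bool:
    """Return True if the (Airflow, Python) pair appears in the tested matrix."""
    if python_version is None:
        python_version = sys.version_info[:2]
    bounds = TESTED_PYTHON.get(airflow_version)
    if bounds is None:
        return False  # not listed in the table
    low, high = bounds
    return low <= tuple(python_version[:2]) < high
```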

Project Information

About this fork

This project is a fork of Toloka AI BV's original repository. It includes substantial modifications by IJKOS & PARTNERS LTD. This fork is not affiliated with or endorsed by Toloka AI BV.

The original project is licensed under the Apache License 2.0.

Migrating from dbt-af

If you're currently using dbt-af and want to migrate to dmp-af, see our Migration Guide for step-by-step instructions.

Download files

Source Distribution

dmp_af-0.17.0.tar.gz (44.5 kB)

Uploaded Source

Built Distribution

dmp_af-0.17.0-py3-none-any.whl (57.6 kB)

Uploaded Python 3

File details

Details for the file dmp_af-0.17.0.tar.gz.

File metadata

  • Download URL: dmp_af-0.17.0.tar.gz
  • Upload date:
  • Size: 44.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dmp_af-0.17.0.tar.gz
Algorithm Hash digest
SHA256 2539288cd2a29b0bdc51f7cb72438d135a8ea24d2c94ff96b954d336ea8d93f0
MD5 fd3f136d970102dec351450275d199fd
BLAKE2b-256 3a7d6986f82fdb362eb4d6522427ff69e7d58793e1c79fe9d728f0abe33b21f6
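A downloaded archive can be checked against the SHA256 digest listed above with Python's standard hashlib module. This is a minimal sketch: the function names are illustrative, and the local file path in the usage note is an assumption about where you saved the download.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Compare a file's digest with the published value, ignoring case and spaces."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_hash("dmp_af-0.17.0.tar.gz", "2539288cd2a29b0bdc51f7cb72438d135a8ea24d2c94ff96b954d336ea8d93f0") should return True for an intact copy of the source distribution saved in the current directory.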

Provenance

The following attestation bundles were made for dmp_af-0.17.0.tar.gz:

Publisher: semantic-release.yml on dmp-labs/dmp-af

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dmp_af-0.17.0-py3-none-any.whl.

File metadata

  • Download URL: dmp_af-0.17.0-py3-none-any.whl
  • Upload date:
  • Size: 57.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dmp_af-0.17.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d128bd629bd0889ff9da567154b882d3ed9e9ea89f36837398c73a07492ab83b
MD5 5bbb33a62a00bbe952db486732ca71fe
BLAKE2b-256 e4262b03a3cb4b6cb5ed59fb21c2aea56cf9b2ecbce1217e3f0673e9a9a26625

Provenance

The following attestation bundles were made for dmp_af-0.17.0-py3-none-any.whl:

Publisher: semantic-release.yml on dmp-labs/dmp-af

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
