Static budget checks and hash stability checks for Apache Airflow DAG files.
airflow-dag-audit
airflow-dag-audit provides static checks for DAG files and an optional reparse hash check.
It is designed for local development, pytest, and command-line use.
The package does not start scheduler, triggerer, webserver, or database services. It focuses on:
- AST-based counts such as imports, Variable.get(...) calls, SQL-like string literals, and top-level calls
- a hash stability check that reparses the same DAG file twice
- a pytest plugin for project defaults and per-test overrides
- a CLI that can be installed as a tool and executed with uvx
Trademark notice
Apache Airflow, Apache, and related marks belong to The Apache Software Foundation. This project is not affiliated with, endorsed by, or sponsored by The Apache Software Foundation. It is a third-party helper package for DAG repositories.
Installation
Library dependency
uv add airflow-dag-audit
If the environment that runs the hash check does not already have Apache Airflow installed, you can install the optional extra:
uv add 'airflow-dag-audit[airflow]'
CLI tool
After publishing to PyPI, the CLI can be executed without creating a project environment:
uvx airflow-dag-audit --help
What is checked
Static metrics
The AST analysis currently reports:
- import_count
- variable_get_count for Variable.get(...)
- sql_query_count for string literals that look like SQL
- top_level_call_count
- detected_dag_decorators
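To make the metrics above concrete, here is a minimal sketch of how such counts can be derived with the standard library ast module. It is not the package's implementation: the function name count_static_metrics, the SQL keyword heuristic, and the definition of a "top-level call" are all assumptions made for illustration.

```python
import ast

# Assumed heuristic: a string "looks like SQL" if it starts with one of these.
SQL_KEYWORDS = ("select ", "insert ", "update ", "delete ")


def count_static_metrics(source: str) -> dict[str, int]:
    """Count a few import-time cost indicators without executing the code."""
    tree = ast.parse(source)
    metrics = {
        "import_count": 0,
        "variable_get_count": 0,
        "sql_query_count": 0,
        "top_level_call_count": 0,
    }

    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            metrics["import_count"] += 1
        elif isinstance(node, ast.Call):
            func = node.func
            # Only the literal spelling `Variable.get(...)` is recognized here.
            if (
                isinstance(func, ast.Attribute)
                and func.attr == "get"
                and isinstance(func.value, ast.Name)
                and func.value.id == "Variable"
            ):
                metrics["variable_get_count"] += 1
        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
            if node.value.lstrip().lower().startswith(SQL_KEYWORDS):
                metrics["sql_query_count"] += 1

    # Assumed definition: a module-level statement whose value is a call.
    for stmt in tree.body:
        value = getattr(stmt, "value", None)
        if isinstance(stmt, (ast.Expr, ast.Assign)) and isinstance(value, ast.Call):
            metrics["top_level_call_count"] += 1

    return metrics


SAMPLE = '''
import os
from airflow.models import Variable

env = Variable.get("env")
QUERY = "SELECT * FROM events"
print(env)
'''

print(count_static_metrics(SAMPLE))
```

Because the source is only parsed, never imported, a check like this runs without Airflow installed and without triggering the side effects it is counting.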
Stable hash
With require_stable_hash=True, the package reparses a DAG file twice and compares canonical serialized payloads.
- If Apache Airflow is importable, the worker tries to serialize matching DAG objects with SerializedDAG.to_dict(...).
- Otherwise, it falls back to a generic serializer for DAG-like Python objects.
The check is useful for detecting DAG definitions that mutate during import or serialization.
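The idea behind the reparse check can be illustrated with a small standard-library sketch. It is a simplified stand-in, not the package's worker: parse_once, canonical_hash, and is_hash_stable are hypothetical names, and the "payload" here is just the plain-data module globals rather than a serialized DAG.

```python
import hashlib
import json


def canonical_hash(payload: dict) -> str:
    """Hash a payload as JSON with sorted keys, so key order cannot change the digest."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def parse_once(source: str) -> dict:
    """Execute the source in a fresh namespace and keep simple public globals.

    Note: this runs the code, so it should only be used on trusted files.
    """
    namespace: dict = {}
    exec(compile(source, "<dag>", "exec"), namespace)
    return {
        name: value
        for name, value in namespace.items()
        if not name.startswith("_") and isinstance(value, (str, int, float, bool))
    }


def is_hash_stable(source: str) -> bool:
    """Parse the same source twice and compare the canonical hashes."""
    first = canonical_hash(parse_once(source))
    second = canonical_hash(parse_once(source))
    return first == second


STABLE = 'DAG_ID = "daily_report"\nSCHEDULE = "@daily"\n'
# A value generated at import time differs on every parse.
UNSTABLE = 'import uuid\nDAG_ID = "run_" + uuid.uuid4().hex\n'

print(is_hash_stable(STABLE))
print(is_hash_stable(UNSTABLE))
```

Typical sources of instability are the ones the unstable sample imitates: random identifiers, timestamps such as datetime.now(), or other values computed while the file is being imported.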
Python API
Basic assertion
from pathlib import Path
from airflow_dag_audit import DagAuditConfig, assert_dag_budget
config = DagAuditConfig(
    max_imports=20,
    max_variable_gets=2,
    max_sql_queries=3,
    max_top_level_calls=10,
    require_stable_hash=True,
)
assert_dag_budget(Path("dags/example_dag.py"), config=config)
Non-raising inspection
from airflow_dag_audit import DagAuditConfig, audit_dag_file
result = audit_dag_file(
    "dags/example_dag.py",
    config=DagAuditConfig(max_imports=20, require_stable_hash=True),
)
print(result.ok)
print(result.metrics.as_dict())
if result.hash_result:
    print(result.hash_result.first_hashes)
Pytest usage
The package exposes a pytest plugin through the pytest11 entry point.
Project defaults in pyproject.toml
The package supports layered defaults in [tool.airflow-dag-audit].
Use [tool.airflow-dag-audit.budget] for global limits and [[tool.airflow-dag-audit.overrides]]
for per-glob or per-file exceptions. Later matching overrides win.
[tool.airflow-dag-audit]
dag_folder = "dags"
include = ["**/*.py"]
exclude = ["**/__pycache__/**", "**/tests/**"]
check_hash_stability = true
hash_parse_repeats = 2
[tool.airflow-dag-audit.budget]
imports = 40
import_froms = 25
variable_get_calls = 10
connection_get_calls = 6
airflow_query_calls = 8
operators = 80
tasks = 150
[[tool.airflow-dag-audit.overrides]]
match = "dags/legacy/*.py"
[tool.airflow-dag-audit.overrides.budget]
imports = 80
variable_get_calls = 25
connection_get_calls = 20
airflow_query_calls = 20
tasks = 300
[[tool.airflow-dag-audit.overrides]]
match = "dags/legacy/specific_bad_but_known.py"
check_hash_stability = false
[tool.airflow-dag-audit.overrides.budget]
imports = 160
import_froms = 90
variable_get_calls = 40
connection_get_calls = 35
airflow_query_calls = 30
operators = 250
tasks = 700
If files is omitted, the package scans dag_folder using include and exclude.
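The "later matching overrides win" rule for the budget tables can be sketched as a simple fold over the override list. This is an illustration under assumptions, not the package's resolver: resolve_settings is a hypothetical helper, matching uses fnmatch-style globs, and overrides are assumed to merge key by key rather than replace the whole budget.

```python
from fnmatch import fnmatchcase

# A subset of the values from the [tool.airflow-dag-audit] example.
GLOBAL_BUDGET = {"imports": 40, "variable_get_calls": 10, "tasks": 150}

OVERRIDES = [
    {
        "match": "dags/legacy/*.py",
        "budget": {"imports": 80, "variable_get_calls": 25, "tasks": 300},
    },
    {
        "match": "dags/legacy/specific_bad_but_known.py",
        "check_hash_stability": False,
        "budget": {"imports": 160, "variable_get_calls": 40, "tasks": 700},
    },
]


def resolve_settings(path: str, check_hash_stability: bool = True) -> dict:
    """Apply every matching override in order, so the last match wins."""
    budget = dict(GLOBAL_BUDGET)
    for override in OVERRIDES:
        if fnmatchcase(path, override["match"]):
            # Assumption: overrides merge per key instead of replacing the table.
            budget.update(override.get("budget", {}))
            check_hash_stability = override.get(
                "check_hash_stability", check_hash_stability
            )
    return {"budget": budget, "check_hash_stability": check_hash_stability}


print(resolve_settings("dags/new/report.py"))
print(resolve_settings("dags/legacy/old.py"))
print(resolve_settings("dags/legacy/specific_bad_but_known.py"))
```

Under this model, the file-specific entry should be listed after the broader glob, as in the example, because it matches both patterns and only its position makes its limits take effect.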
[tool.pytest.ini_options]
airflow_dag_audit_dag_folder = "dags"
airflow_dag_audit_dag_files = [
    "dags/example_good.py",
    "dags/example_unstable.py",
]
airflow_dag_audit_max_imports = "20"
airflow_dag_audit_max_variable_gets = "2"
airflow_dag_audit_max_sql_queries = "3"
airflow_dag_audit_max_top_level_calls = "10"
airflow_dag_audit_require_stable_hash = "true"
Test code
from airflow_dag_audit import assert_dag_budget
def test_dag_budget(dag_file, dag_audit_config) -> None:
    assert_dag_budget(dag_file, config=dag_audit_config)
Per-test overrides
import pytest
from airflow_dag_audit import assert_dag_budget
@pytest.mark.airflow_dag_budget(max_imports=8, require_stable_hash=False)
def test_small_dag(dag_file, dag_audit_config) -> None:
    assert_dag_budget(dag_file, config=dag_audit_config)
Command-line overrides
uv run pytest \
--airflow-dag-file dags/example_good.py \
--airflow-dag-folder dags \
--airflow-dag-max-imports 20 \
--airflow-dag-max-variable-gets 2 \
--airflow-dag-max-sql-queries 3 \
--airflow-dag-max-top-level-calls 10 \
--airflow-dag-require-stable-hash
Why --airflow-dag-folder exists
When you point to a single DAG file, the package tries to infer a useful DAG folder automatically.
That works for common layouts, especially when the file lives under a dags/ directory or inside a Python package.
If the file relies on sibling modules, package-relative imports, or a non-standard repository layout, pass --airflow-dag-folder explicitly.
The same applies to DagAuditConfig(dag_folder=...) in Python code.
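One plausible shape for such an inference is sketched below. The actual heuristic used by the package is not documented here, so infer_dag_folder and both of its rules are assumptions: prefer the nearest ancestor named dags, otherwise climb out of any enclosing Python package so that package imports can resolve from the returned folder.

```python
from pathlib import Path


def infer_dag_folder(dag_file: Path) -> Path:
    """Guess the folder that should act as the import root for a DAG file."""
    # Rule 1 (assumed): the nearest ancestor directory named "dags" wins.
    for parent in dag_file.parents:
        if parent.name == "dags":
            return parent

    # Rule 2 (assumed): leave any enclosing package, identified by __init__.py,
    # and return the first directory that is not itself a package.
    folder = dag_file.parent
    while (folder / "__init__.py").is_file() and folder.parent != folder:
        folder = folder.parent
    return folder


print(infer_dag_folder(Path("repo/dags/team/etl.py")))
```

A layout that satisfies neither rule, for example shared helper modules kept in a sibling directory outside dags/, is the kind of case where an explicit folder is the safer choice.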
CLI usage
Scan without failing the process
uvx airflow-dag-audit scan dags \
--max-imports 20 \
--max-variable-gets 2 \
--max-sql-queries 3 \
--max-top-level-calls 10
Fail on budget violations
uvx airflow-dag-audit assert dags \
--max-imports 20 \
--max-variable-gets 2 \
--max-sql-queries 3 \
--max-top-level-calls 10 \
--require-stable-hash
Check only the reparse hash
uvx airflow-dag-audit hash dags/example_unstable.py --dag-folder dags --show-diff
JSON output
uvx airflow-dag-audit scan dags --json
Use only pyproject.toml
uvx airflow-dag-audit assert
Development
Install dependencies
uv sync --group dev
Run tests
uv run pytest
Build distributions
uv run python -m build
Publishing from GitHub tags
The repository includes two workflows:
- .github/workflows/ci.yml for tests and package build
- .github/workflows/publish.yml for PyPI publishing on version tags
The publishing workflow is written for PyPI Trusted Publishing.
Examples
The examples/ directory contains:
- a stable DAG-like file
- an unstable DAG-like file that changes hash across reparses
- a pytest example