Dry run dbt projects

dbt-dry-run

dbt is a tool that helps manage data transformations using templated SQL queries, which are executed against a target data warehouse. dbt does not check the validity of these queries before it executes your project. This dry runner uses BigQuery's dry run capability to check that the SQL queries are valid before you try to execute them.

See the blog post for more information on how the dry runner works.

Terminal Recording of failing dry run

Quickstart

Installation

The dry runner can be installed via pip:

pip install dbt-dry-run

Running

The dry runner has a single command, dbt-dry-run. Before you run it, you must first compile a dbt manifest using dbt compile.

How much of the project should I compile?

It is best practice to compile the entire dbt project when supplying a manifest for dry run. The dry run loops through your project in the DAG order (staging -> intermediate -> mart) based on `ref` and predicts the schema of each model as it progresses. If you dry run `marts` but have not compiled `staging` then it cannot determine if `marts` will run as it does not know the predicted schema of the upstream models and you will see `NotCompiledException` in the dry run output.
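The ordering requirement can be illustrated with a small sketch. It is not the dry runner's actual implementation: the function name `dry_run_order`, the `NotCompiledError` class (a stand-in for the `NotCompiledException` mentioned above), and the example model names are all assumptions made for illustration. It walks the `ref` graph in dependency order and stops when a compiled model depends on an upstream model whose schema was never predicted.

```python
from graphlib import TopologicalSorter


class NotCompiledError(Exception):
    """Hypothetical stand-in for the NotCompiledException shown in dry run output."""


def dry_run_order(refs, compiled):
    """Return the compiled models in the order they could be dry run.

    refs maps each model name to the set of upstream models it uses via `ref`.
    compiled is the set of model names that have compiled SQL in the manifest.
    """
    predicted = {}  # model name -> placeholder for its predicted schema
    for model in TopologicalSorter(refs).static_order():
        if model not in compiled:
            continue  # no compiled SQL, so no schema can be predicted
        missing = sorted(up for up in refs.get(model, ()) if up not in predicted)
        if missing:
            raise NotCompiledError(f"{model} depends on uncompiled models: {missing}")
        predicted[model] = f"<predicted schema of {model}>"
    return list(predicted)


# Hypothetical three-layer project: staging -> intermediate -> mart
refs = {
    "stg_orders": set(),
    "int_orders": {"stg_orders"},
    "mart_orders": {"int_orders"},
}
```

With every layer compiled, the function returns the models staging first. With only `mart_orders` compiled it raises, because nothing upstream has a predicted schema.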

Once the manifest is compiled, run the dry runner on the same machine (so that it has access to your dbt project source and the manifest.json) and in the same directory as your dbt_project.yml:

dbt-dry-run

Like dbt it will search for profiles.yml in ~/.dbt/ and use the default target specified. Just like in the dbt CLI you can override these defaults:

dbt-dry-run default --project-dir /my_org_dbt/ --profiles-dir /my_org_dbt/profiles/ --target local

The full CLI help is shown below. Anything prefixed with [dbt] can be used in the same way as a normal dbt parameter:

  ❯ dbt-dry-run --help
   Usage: dbt-dry-run [OPTIONS]
   
   Options:
     --profiles-dir TEXT             [dbt] Where to search for `profiles.yml`
                                     [default: /Users/connor.charles/.dbt]
     --project-dir TEXT              [dbt] Where to search for `dbt_project.yml`
                                     [default: /Users/connor.charles/Code/dbt-
                                     dry-run]
     --vars TEXT                     [dbt] CLI Variables to pass to dbt
                                     [default: {}]
     --target TEXT                   [dbt] Target profile
     --target-path TEXT              [dbt] Target path
     --verbose / --no-verbose        Output verbose error messages  [default: no-
                                     verbose]
     --report-path TEXT              Json path to dump report to
     --skip-not-compiled             Whether or not the dry run should ignore
                                     models that are not compiled. This has
                                     several caveats that make this not a
                                     recommended option. The dbt manifest should
                                     generally be compiled with `--select *` to
                                     ensure good  coverage
     --full-refresh                  [dbt] Full refresh
     --extra-check-columns-metadata-key TEXT
                                     An extra metadata key that can be used in
                                     place of `dry_run.check_columns` for
                                     verifying column metadata has been specified
                                     correctly. `dry_run.check_columns` will
                                     always take precedence. The metadata key
                                     should be of boolean type or it will be cast
                                     to a boolean to be 'True/Falsey`
     --version
     --install-completion [bash|zsh|fish|powershell|pwsh]
                                     Install completion for the specified shell.
     --show-completion [bash|zsh|fish|powershell|pwsh]
                                     Show completion for the specified shell, to
                                     copy it or customize the installation.
     --help                          Show this message and exit.
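For scripted use, for example in a CI job, the command line can be assembled programmatically. The following is a minimal sketch: the helper names `build_dry_run_command` and `run_dry_run` are hypothetical, and only options that appear in the help output above are used. It assumes `dbt compile` has already produced a manifest.

```python
import subprocess


def build_dry_run_command(project_dir=None, profiles_dir=None, target=None, report_path=None):
    """Build the argument list for dbt-dry-run, skipping options left as None."""
    command = ["dbt-dry-run"]
    options = [
        ("--project-dir", project_dir),
        ("--profiles-dir", profiles_dir),
        ("--target", target),
        ("--report-path", report_path),
    ]
    for flag, value in options:
        if value is not None:
            command += [flag, str(value)]
    return command


def run_dry_run(command):
    """Run the dry runner and return its exit code (needs dbt-dry-run on PATH)."""
    return subprocess.run(command).returncode
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.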

Reporting Results & Failures

If the result is successful it will output the number of models that were tested like so:

Dry running 3 models

DRY RUN SUCCESS!

The process will also return exit code 0.

If there are failures it will print a summary table of the nodes that failed:

Dry running 3 models
Node model.test_models_with_invalid_sql.second_layer failed with exception:
400 POST https://bigquery.googleapis.com/...: Column d in USING clause not found on left side of join at [6:88]

(job ID: 5e336f32-273d-480a-b8bb-cdf4fca66a98)

Total 1 failures:
1       :       model.test_models_with_invalid_sql.second_layer :       BadRequest      :       ERROR
DRY RUN FAILURE!

The process will also return exit code 1.

Column and Metadata Linting

The dry runner can also be configured to inspect your metadata YAML and assert that the predicted schema of your dbt project's data warehouse matches what is documented in the metadata. To enable this for a model, set the key dry_run.check_columns: true in its meta config. The dry runner will then fail if the model's documentation does not match the predicted schema. You can also specify an extra key that enables the check by setting the CLI argument --extra-check-columns-metadata-key. For example, the full metadata for this model:

models:
  - name: badly_documented_model
    description: This model is missing some columns in its docs
    config:
      meta:
        dry_run.check_columns: true
    columns:
      - name: a
        description: This is in the model

      - name: b
        description: This is in the model

      #      - name: c
      #        description: Forgot to document c

      - name: d
        description: This shouldn't be here

This model is badly documented because the predicted schema has 3 columns: a, b and c. The dry runner will therefore output the following error and fail your CI/CD checks:

Dry running X models
Node model.test_column_linting.badly_documented_model failed linting with rule violations:
        UNDOCUMENTED_COLUMNS : Column not documented in metadata: 'c'
        EXTRA_DOCUMENTED_COLUMNS : Extra column in metadata: 'd'

Total 1 failures:
1       :       model.test_column_linting.badly_documented_model        :       LINTING :       ERROR
DRY RUN FAILURE!

Currently, these rules can cause linting failures:

UNDOCUMENTED_COLUMNS: The predicted schema of the model will have extra columns that have not been documented in the YAML
EXTRA_DOCUMENTED_COLUMNS: The predicted schema of the model does not have this column that was specified in the metadata
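The two rules amount to a set comparison between the predicted columns and the documented ones. The sketch below illustrates that idea only. The function name `lint_columns` is hypothetical and this is not the dry runner's own code.

```python
def lint_columns(predicted, documented):
    """Compare predicted column names with documented ones.

    Returns a list of (rule, column) pairs, using the rule names listed above.
    """
    predicted_set = set(predicted)
    documented_set = set(documented)
    violations = []
    # Columns the model will have that the YAML does not mention
    for name in sorted(predicted_set - documented_set):
        violations.append(("UNDOCUMENTED_COLUMNS", name))
    # Columns the YAML mentions that the model will not have
    for name in sorted(documented_set - predicted_set):
        violations.append(("EXTRA_DOCUMENTED_COLUMNS", name))
    return violations
```

For the badly documented model above, the predicted columns are a, b and c while the documented columns are a, b and d, so the comparison reports one violation of each rule.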

Usage with dbt-external-tables

The dbt package dbt-external-tables gives dbt support for staging and managing external tables. These sources do not produce any compiled SQL in the manifest, so it is not possible for the dry runner to predict their schema. Therefore, you must specify the resulting schema manually in the metadata of the source.

However, if the columns schema is already defined under the name in the yaml config, you do not need to specify dry_run_columns under external. The dry runner will use the columns schema if dry_run_columns is not specified. This avoids duplicated schema definitions.

For example, if you were importing data from a GCS bucket:

version: 2

sources:
  - name: source_dataset
    tables:
      - name: event
        description: "Some events bucket. If external is populated then the dry runner will assume it is using `dbt-external-tables`"
        external:
          location: 'gs://bucket/path/*'
          options:
            format: csv

          dry_run_columns:
            - name: string_field
              data_type: STRING
              description: "Specify each column in the yaml for external sources"
            - name: record_array_field[]
              data_type: RECORD[]
              description: "For struct/record fields specify the `data_type` as `RECORD`"
            - name: record_array_field.foo
              data_type: NUMERIC
              description: "For record attributes use the dot notation"
            - name: integer_array
              data_type: NUMERIC[]
              description: "For repeated fields suffix data_type with []"

The dry runner cannot predict the schema of these sources, so it is up to you to describe the schema accurately in the YAML. Otherwise you may get false positive or false negative results from the dry run.
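To make the naming conventions concrete, here is one possible reading of them, written as a small sketch. It is an assumption about how the flat list could map to a nested schema, not the dry runner's actual parsing logic. The function name `build_schema`, the output layout, and the `NULLABLE` default are all illustrative choices.

```python
def build_schema(columns):
    """Turn a flat dry_run_columns list into a nested dict of fields.

    A trailing [] on data_type marks a repeated field, and dots in the
    name place the column inside a previously declared record.
    """
    root = {}
    for column in columns:
        name = column["name"].removesuffix("[]")
        data_type = column["data_type"]
        field = {
            "type": data_type.removesuffix("[]"),
            "mode": "REPEATED" if data_type.endswith("[]") else "NULLABLE",
            "fields": {},
        }
        *parents, leaf = name.split(".")
        level = root
        for parent in parents:
            # Raises KeyError if the parent record was not declared first
            level = level[parent]["fields"]
        level[leaf] = field
    return root


# The columns from the YAML example above
columns = [
    {"name": "string_field", "data_type": "STRING"},
    {"name": "record_array_field[]", "data_type": "RECORD[]"},
    {"name": "record_array_field.foo", "data_type": "NUMERIC"},
    {"name": "integer_array", "data_type": "NUMERIC[]"},
]
schema = build_schema(columns)
```

In this reading, `record_array_field` becomes a repeated record containing a numeric `foo`, and `integer_array` becomes a repeated numeric field.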

Report Artefact

If you specify --report-path, a JSON file will be written regardless of dry run success or failure, with detailed information on each node's predicted schema, or its error message if it has failed:

{
  "success": false,
  "node_count": 3,
  "failure_count": 1,
  "failed_node_ids": [
    "model.test_models_with_invalid_sql.second_layer"
  ],
  "nodes": [
    {
      "unique_id": "seed.test_models_with_invalid_sql.my_seed",
      "success": true, 
      "status": "SUCCESS",
      "error_message": null,
      "table": {
        "fields": [
          ...
        ]
      }
    },
    {
      "unique_id": "model.test_models_with_invalid_sql.first_layer",
      "success": true,
      "status": "SUCCESS",
      "error_message": null,
      "table": {
        "fields": [
          ...
        ]
      }
    },
    {
      "unique_id": "model.test_models_with_invalid_sql.second_layer",
      "success": false,
      "status": "FAILURE",
      "error_message": "BadRequest",
      "table": null
    }
  ]
}
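A report in this shape is straightforward to post-process, for example to annotate a pull request with the failing nodes. The sketch below relies only on the keys shown in the example report. The helper names `load_report` and `summarise_failures` are hypothetical.

```python
import json


def load_report(path):
    """Read a report file written via --report-path."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarise_failures(report):
    """Return (unique_id, error_message) for every node that did not succeed."""
    return [
        (node["unique_id"], node["error_message"])
        for node in report["nodes"]
        if not node["success"]
    ]
```

Applied to the example report above, `summarise_failures` returns a single entry for `model.test_models_with_invalid_sql.second_layer` with the message `BadRequest`.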

Capabilities and Limitations

Things this can catch

The dry run can catch anything the BigQuery planner can identify before the query has run, which includes:

Typos in SQL keywords: selec instead of select
Typos in column names: orders.produts instead of orders.products
Problems with incompatible data types: Trying to execute "4" + 4
Incompatible schema changes to models: Removing a column from a view that is referenced by a downstream model explicitly
Incompatible schema changes to sources: Third party modifies schema of source tables without your knowledge
Permission errors: The dry runner should run under the same service account your production job runs under. This allows you to catch problems with table/project permissions as dry run queries need table read permissions just like the real query
Incorrect configuration of snapshots: For example, a typo in the unique_key config, or check_cols that do not exist in the snapshot

Things this can't catch

There are certain cases where a syntactically valid query can fail due to the data in the tables:

Queries that run but do not return intended/correct result. This is checked using tests
NULL values in ARRAY_AGG (See IGNORE_NULLS bullet point)
Bad query performance that makes it too complex/expensive to run

In an incremental table, it is not possible to change the data type of a nested field within a RECORD. dbt-dry-run will not flag such a change as a failure.

The dry runner will not test schema changes for materialized views. It will only test the syntax of the SQL query.

License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
