
Orchestra wrapper for dbt Core - allows for stateful orchestration of dbt Core projects.

Project description

dbt-orchestra

Introduction

dbt-orchestra wraps dbt Core commands, using previous run state to reduce unnecessary work.

It is designed to be added to an existing dbt Core project, not used as a standalone dbt repository.

Compatibility and prerequisites

  • Python: 3.11, 3.12, and 3.13 only (see requires-python in pyproject.toml).
  • dbt-core: 1.10.x and 1.11.x when using stateful orchestration.
  • A dbt Core project: an existing dbt Core project where you already run dbt build / dbt run / dbt test.

Installing

  1. Install dbt-orchestra in the same environment as your dbt project:

    pip install dbt-orchestra
    
  2. Add a minimal config block to your dbt project's pyproject.toml:

    [tool.orchestra_dbt]
    use_stateful = true
    state_file = ".orchestra/dbt_state.json"
    local_run = true
    
  3. Bootstrap the local state file once:

    mkdir -p .orchestra
    echo '{"state":{}}' > .orchestra/dbt_state.json
    
  4. Run your normal dbt command through orc:

    orc dbt run
    

After this run, your .orchestra/dbt_state.json state file will contain freshness information. Subsequent runs compare that information against your project's model freshness configuration to decide which work can be skipped.
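The bootstrap step above (create the directory, seed an empty state file) can be sketched in Python. Note that the quick start only guarantees the top-level {"state": {}} shape; anything nested inside it is backend-specific and treated here as opaque:

```python
import json
from pathlib import Path


def bootstrap_state(path: str = ".orchestra/dbt_state.json") -> dict:
    """Create an empty dbt-orchestra state file if missing, then load it.

    Only the top-level {"state": {}} shape is shown in the quick start;
    nested contents are backend-specific and not assumed here.
    """
    state_path = Path(path)
    state_path.parent.mkdir(parents=True, exist_ok=True)
    if not state_path.exists():
        state_path.write_text(json.dumps({"state": {}}))
    return json.loads(state_path.read_text())


state = bootstrap_state()
print(sorted(state.keys()))  # → ['state']
```

This is equivalent to the mkdir/echo pair in step 3, but idempotent: re-running it never clobbers an existing state file.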

If you want a small demo dbt project to try this with, see tutorial/README.md.

State backends

Local JSON file (quick start)

Local JSON is the easiest way to try state-aware orchestration quickly. Keep ORCHESTRA_API_KEY unset so ORCHESTRA_STATE_FILE or state_file in pyproject.toml is used.

    [tool.orchestra_dbt]
    use_stateful = true
    state_file = ".orchestra/dbt_state.json"
    local_run = true

Then run:

    orc dbt run

Orchestra Cloud (managed)

Managing your dbt Core state in Orchestra requires an Orchestra API key. When ORCHESTRA_API_KEY is set, dbt-orchestra selects this backend and ignores file-related settings. Put non-secret defaults in pyproject.toml and export only the API key:

    [tool.orchestra_dbt]
    use_stateful = true
    local_run = true

Then export the key and run:

    export ORCHESTRA_API_KEY=<API_KEY>
    orc dbt run

If you want to run state-aware dbt Core code without managing state files and the dbt-orchestra CLI tool, try running your dbt Core in Orchestra. Orchestra users can enable state-aware orchestration using a simple toggle.

S3 backend

To store your dbt Core state in S3, install the optional dependency (pip install 'dbt-orchestra[s3]' or uv sync --extra s3). Credentials and region follow the usual AWS SDK resolution (environment variables, shared config, IAM role, etc.). If the object does not exist yet, load starts with an empty state and save creates the object.
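The missing-object behaviour described above can be sketched with a generic fetcher standing in for the S3 client. In the real backend this would be a boto3 get_object call; the function names, the in-memory bucket, and the use of FileNotFoundError as the "object missing" signal are all assumptions for illustration:

```python
import json


def load_state(fetch, uri: str) -> dict:
    """Load state JSON via fetch(uri); fall back to empty state if absent.

    `fetch` stands in for an S3 client call (e.g. boto3 get_object). It
    should return bytes, or raise FileNotFoundError when the object is
    missing. The empty-state fallback mirrors the documented behaviour:
    if the object does not exist yet, load starts with an empty state.
    """
    try:
        raw = fetch(uri)
    except FileNotFoundError:
        return {"state": {}}  # object not created yet: start empty
    return json.loads(raw)


def fake_fetch(uri):
    # Hypothetical in-memory stand-in for an S3 bucket.
    objects = {"s3://bucket/dbt_state.json": b'{"state": {"model.a": 1}}'}
    if uri not in objects:
        raise FileNotFoundError(uri)
    return objects[uri]


print(load_state(fake_fetch, "s3://bucket/dbt_state.json"))  # → {'state': {'model.a': 1}}
print(load_state(fake_fetch, "s3://bucket/missing.json"))    # → {'state': {}}
```

On save, the real backend would simply write the serialized state back to the same key, creating the object if needed.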

Daily usage

Stateful orchestration only runs for dbt build, dbt run, and dbt test. Other dbt subcommands are passed through to dbt unchanged.

Runtime behavior by command and mode

  • use_stateful = false, any command: orc passes through to dbt with no state load/save.
  • use_stateful = true, build / run / test: orc loads state, computes reusable nodes, patches clean nodes, runs dbt, then updates and saves state.
  • use_stateful = true, build / run / test with --full-refresh: orc skips reuse decisions for this invocation, runs dbt directly, then still updates and saves state after execution.
  • use_stateful = true, other commands (for example seed, docs generate): orc passes through to dbt unchanged.
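The behavior above can be expressed as a small decision function. The name decide_mode and its return labels are illustrative, not part of the dbt-orchestra API:

```python
def decide_mode(use_stateful: bool, dbt_args: list[str]) -> str:
    """Mirror the runtime-behavior rules: how orc treats one invocation.

    The labels ("passthrough", "stateful", "refresh-then-save") are
    illustrative names, not dbt-orchestra internals.
    """
    stateful_cmds = {"build", "run", "test"}
    command = dbt_args[0] if dbt_args else ""
    if not use_stateful or command not in stateful_cmds:
        return "passthrough"        # hand off to dbt, no state load/save
    if "--full-refresh" in dbt_args:
        return "refresh-then-save"  # skip reuse, still save state after
    return "stateful"               # load state, reuse, run, save


print(decide_mode(True, ["run"]))                    # → stateful
print(decide_mode(True, ["run", "--full-refresh"]))  # → refresh-then-save
print(decide_mode(True, ["seed"]))                   # → passthrough
print(decide_mode(False, ["build"]))                 # → passthrough
```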

Configuration reference

When stateful orchestration is enabled, the CLI loads and saves dbt Core state. Enable it with use_stateful = true under [tool.orchestra_dbt], or set ORCHESTRA_USE_STATEFUL=true. That state is the same JSON shape regardless of the backend used.

Do not put secrets in pyproject.toml. Use environment variables (or your platform's secret store) for ORCHESTRA_API_KEY.

Configuration precedence

For non-secret options, if an environment variable is set, it overrides values from [tool.orchestra_dbt]; otherwise the value from pyproject.toml is used, then the built-in default. The CLI discovers pyproject.toml by walking upward from the current working directory. [tool.orchestra_dbt] is read from that file when present.

[tool.orchestra_dbt] options

  • state_file (string, optional): local JSON path or s3://bucket/key for state (see the backend table below).
  • use_stateful (bool, default false): turn on stateful orchestration for supported dbt commands.
  • local_run (bool, default true): after reuse, revert patched files (typical for local iteration).
  • debug (bool, default false): verbose logging.
  • integration_account_id (string, optional): when set, filter state keys to this integration account prefix.

Equivalent environment overrides (when set): ORCHESTRA_USE_STATEFUL, ORCHESTRA_LOCAL_RUN, ORCHESTRA_DBT_DEBUG, ORCHESTRA_INTEGRATION_ACCOUNT_ID.

Resolving multiple backend state configurations

  1. ORCHESTRA_API_KEY: load/save state via Orchestra HTTP. When the API key is set, ORCHESTRA_STATE_FILE and state_file in pyproject.toml are ignored when choosing the state backend.
  2. ORCHESTRA_STATE_FILE: path to a JSON file, or s3://bucket/key for an object in S3. Relative file paths are resolved from the current working directory. Used only when ORCHESTRA_API_KEY is unset.
  3. state_file under [tool.orchestra_dbt] in pyproject.toml: path to a JSON file, or s3://bucket/key. Relative file paths are resolved from the directory containing the discovered pyproject.toml; absolute paths are used as-is. Used only when both ORCHESTRA_API_KEY and ORCHESTRA_STATE_FILE are unset.

If an effective local path or S3 URI is configured (rows 2 or 3), that file or S3 backend is used and an API key is not required for state. If ORCHESTRA_API_KEY is set (row 1), the HTTP backend is used regardless of file settings.

Warehouse adapters and implicit source freshness

Stateful reuse uses dbt source freshness results. When a source defines loaded_at_field or loaded_at_query, dbt's normal freshness logic runs on every adapter Orchestra supports through dbt Core.

When both are omitted, Orchestra can still run adapter-specific SQL to infer max_loaded_at (see src/orchestra_dbt/source_freshness/). Only the adapters below register that path today; the mapping is keyed by FreshnessRunner.adapter.type().

  • Databricks (databricks): supported; Orchestra infers freshness via DESCRIBE HISTORY on the source relation.
  • Snowflake (snowflake): no Orchestra fallback; configure loaded_at_field or loaded_at_query for standard dbt freshness.
  • Microsoft Fabric (fabric): same as Snowflake; configure loaded_at_*.
  • AWS Redshift (redshift): same as Snowflake; configure loaded_at_*.
  • PostgreSQL (postgres): same as Snowflake; configure loaded_at_*.
  • DuckDB (duckdb): not supported.
  • Other adapters: no Orchestra fallback unless listed above; use loaded_at_* or verify dbt's default freshness behavior for your warehouse.

For adapters without a registered fallback, if both loaded_at settings are missing, Orchestra follows dbt's FreshnessRunner behavior (which may surface as warnings or a non-actionable result depending on dbt and the warehouse).

Example snippet

Example optional snippet in pyproject.toml:

[tool.orchestra_dbt]
use_stateful = true
state_file = ".orchestra/dbt_state.json"

Add .orchestra/ (or your chosen path) to .gitignore if the file should not be committed.

Development and contributing

For contributor guidance, see CONTRIBUTING.md.

Project details


Download files

Download the file for your platform.

Source Distribution

dbt_orchestra-1.0.0.tar.gz (25.5 kB)

Uploaded Source

Built Distribution


dbt_orchestra-1.0.0-py3-none-any.whl (36.1 kB)

Uploaded Python 3

File details

Details for the file dbt_orchestra-1.0.0.tar.gz.

File metadata

  • Download URL: dbt_orchestra-1.0.0.tar.gz
  • Upload date:
  • Size: 25.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dbt_orchestra-1.0.0.tar.gz
  • SHA256: e43bdb2b6897f603726eeda2df10b56e659d08c73bbfe5c1a7fea9e6eedd047a
  • MD5: 652c35f0d096b31797a6d7d3bab65682
  • BLAKE2b-256: 0d94f1e34966d5bdfe58ba1fe1e63263091b0a328d9b687cd3a65eb6450fbc4f


Provenance

The following attestation bundles were made for dbt_orchestra-1.0.0.tar.gz:

Publisher: publish-pypi.yml on orchestra-hq/sao-paolo

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dbt_orchestra-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: dbt_orchestra-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 36.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dbt_orchestra-1.0.0-py3-none-any.whl
  • SHA256: 30184badeb612b5dd2601ca89b3d11704a68cda47cf6536519516950f77441de
  • MD5: e2a599958058e3ff871827c2641e8c1b
  • BLAKE2b-256: ce299fa2fc61c4bf2981fe3f7340822b12138e142a61bb9b66c83f3547330e84


Provenance

The following attestation bundles were made for dbt_orchestra-1.0.0-py3-none-any.whl:

Publisher: publish-pypi.yml on orchestra-hq/sao-paolo

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
