CI/CD tool for dbt projects with intelligent change detection and selective execution
dbt-ci
A CI tool for dbt (data build tool) projects that runs only the models modified relative to a reference state, with support for local, Docker, bash, and dbt Python API runners.
How It Works
dbt-ci uses a cache-based workflow:
- init - Downloads reference state from cloud storage (or uses local), compares with current code, and creates a cache of changes
- run/delete/ephemeral - Use the cached state automatically (no need to re-specify state paths)
This design ensures:
- ✅ Consistent state across all commands in a CI run
- ✅ Better performance (no redundant state downloads)
- ✅ Simpler CLI (specify state once in init, reuse everywhere)
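To make the "compare, then cache" idea concrete, here is a minimal Python sketch of a state comparison. It is not dbt-ci's implementation: the function names and the cache file are hypothetical, and it assumes only that each entry under "nodes" in a dbt manifest.json carries a "checksum" object.

```python
import json
from pathlib import Path


def load_checksums(manifest_path):
    """Map each node's unique_id to its file checksum from a dbt manifest.json."""
    manifest = json.loads(Path(manifest_path).read_text())
    return {
        unique_id: node.get("checksum", {}).get("checksum")
        for unique_id, node in manifest.get("nodes", {}).items()
    }


def diff_checksums(reference, current):
    """Classify nodes as new, modified, or deleted by comparing checksums."""
    shared = set(reference) & set(current)
    return {
        "new": sorted(set(current) - set(reference)),
        "modified": sorted(uid for uid in shared if reference[uid] != current[uid]),
        "deleted": sorted(set(reference) - set(current)),
    }


def write_cache(changes, cache_path):
    """Persist the change set so later commands can reuse it (hypothetical cache file)."""
    Path(cache_path).write_text(json.dumps(changes, indent=2))
```

In this picture, init would call something like diff_checksums once and write the result, and run, delete, and ephemeral would read the cached file instead of comparing manifests again.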
Installation
From PyPI (Recommended)
pip install dbt-ci
From GitHub
# Install from main branch
pip install git+https://github.com/datablock-dev/dbt-ci.git@main
# Install a specific version
pip install git+https://github.com/datablock-dev/dbt-ci.git@v1.0.0
Local Development
git clone https://github.com/datablock-dev/dbt-ci.git
cd dbt-ci
pip install -e ".[dev]"
After installation, the tool is available as dbt-ci.
Quick Start
The Workflow: Initialize once with init, then run commands that use the cached state.
1. Initialize State
First, initialize the dbt-ci state. This downloads/reads reference state and creates a cache:
dbt-ci init \
--dbt-project-dir dbt \
--profiles-dir dbt \
--reference-target production \
--state dbt/.dbtstate
With Cloud Storage (GCS/S3):
dbt-ci init \
--dbt-project-dir dbt \
--state-uri gs://my-bucket/dbt-state/manifest.json \
--reference-target production \
--state dbt/.dbtstate
2. Run Modified Models
After initialization, run commands use the cached state automatically:
# No need to specify --state again!
dbt-ci run \
--dbt-project-dir dbt \
--profiles-dir dbt
With Docker:
dbt-ci run \
--runner docker \
--docker-image ghcr.io/dbt-labs/dbt-bigquery:latest
Commands
All commands share a set of common options (listed in the Common Options section below). Command-specific flags are listed under each command.
init - Initialize State
Creates initial state from your dbt project. Always run this first. Downloads reference manifest from cloud storage (if specified) and creates a local cache for subsequent commands.
dbt-ci init \
--dbt-project-dir dbt \
--profiles-dir dbt \
--state-uri gs://my-bucket/manifest.json \
--reference-target production \
--state dbt/.dbtstate
Flags:
| Flag | Aliases | Env Var(s) | Default | Description |
|---|---|---|---|---|
| --reference-target | --ref-target | DBT_REFERENCE_TARGET | None | dbt target for the production/reference manifest |
| --reference-vars | --ref-vars | DBT_REFERENCE_VARS | None | Variables to pass to dbt when compiling the reference manifest (YAML string or file path) |
| --state-uri | | DBT_STATE_URI, STATE_URI | None | Remote URI for the state manifest (e.g. gs://bucket/manifest.json, s3://bucket/manifest.json) |
| --target-compile | | DBT_TARGET_COMPILE | false | Run the second compile pass against the actual target |
| --skip-reference-compile | | DBT_SKIP_REFERENCE_COMPILE | false | Skip the compile pass against the reference/production state |
| --no-git | | DBT_NO_GIT | false | Skip git-based file change comparison |
| --comparison-strategy | --comparison | DBT_COMPARISON_STRATEGY | hybrid | Strategy for detecting changed nodes: dbt, git, or hybrid |
All common options also apply.
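The three --comparison-strategy values can be pictured with a small Python sketch. This is one plausible reading, not dbt-ci's documented behaviour: treating hybrid as the union of the dbt-state and git signals is an assumption, and both function names are hypothetical. It relies only on manifest nodes carrying an "original_file_path" field.

```python
def nodes_touched_by_files(nodes, changed_files):
    """Return unique_ids of nodes whose source file is among changed_files.

    `nodes` maps unique_id -> node dict with "original_file_path", as in the
    "nodes" section of a dbt manifest.json. Paths must be expressed relative
    to the same root as the git diff output for the match to work.
    """
    changed = set(changed_files)
    return {
        uid for uid, node in nodes.items()
        if node.get("original_file_path") in changed
    }


def select_changed(strategy, dbt_changed, git_changed):
    """Combine the two signals; treating 'hybrid' as a union is an assumption."""
    if strategy == "dbt":
        return set(dbt_changed)
    if strategy == "git":
        return set(git_changed)
    if strategy == "hybrid":
        return set(dbt_changed) | set(git_changed)
    raise ValueError(f"unknown comparison strategy: {strategy!r}")
```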
run - Run Modified Models
Detects and runs models that have changed. Uses cached state from init.
dbt-ci run --dbt-project-dir dbt --mode models
Flags:
| Flag | Aliases | Env Var(s) | Default | Description |
|---|---|---|---|---|
| --mode | -m, --nodes, -n | DBT_NODES | all | What to run: all, models, seeds, snapshots, tests |
| --filters | -f | | None | Extra resource-type filter (repeatable, choices: models, seeds, snapshots, tests). E.g. --mode tests -f snapshots to run only tests that have a snapshot dependency |
All common options also apply.
Examples:
# Run only modified models
dbt-ci run --mode models
# Run modified models with defer to production
dbt-ci run --mode models --defer
# Run all modified resources (models, tests, seeds, etc.)
dbt-ci run --mode all
# With Docker
dbt-ci run --runner docker --mode models
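The --filters example from the flag table (--mode tests -f snapshots) can be pictured as a dependency check over the manifest. The Python sketch below is illustrative only: the function name is hypothetical, and whether dbt-ci looks at direct or transitive dependencies is not stated, so direct dependencies are assumed here.

```python
def filter_by_dependency(nodes, candidates, required_type):
    """Keep candidates that directly depend on a node of required_type.

    `nodes` maps unique_id -> {"resource_type": ..., "depends_on": {"nodes": [...]}},
    mirroring the shape of entries in a dbt manifest.json.
    """
    kept = []
    for uid in candidates:
        parents = nodes[uid].get("depends_on", {}).get("nodes", [])
        if any(nodes.get(p, {}).get("resource_type") == required_type for p in parents):
            kept.append(uid)
    return kept


# Toy manifest fragment: one test on a snapshot, one test on a model.
nodes = {
    "snapshot.p.orders_snapshot": {"resource_type": "snapshot", "depends_on": {"nodes": []}},
    "model.p.orders": {"resource_type": "model", "depends_on": {"nodes": []}},
    "test.p.not_null_snapshot_id": {
        "resource_type": "test",
        "depends_on": {"nodes": ["snapshot.p.orders_snapshot"]},
    },
    "test.p.unique_orders_id": {
        "resource_type": "test",
        "depends_on": {"nodes": ["model.p.orders"]},
    },
}
changed_tests = ["test.p.not_null_snapshot_id", "test.p.unique_orders_id"]
print(filter_by_dependency(nodes, changed_tests, "snapshot"))
# prints ['test.p.not_null_snapshot_id']
```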
ephemeral - Ephemeral Environment
Clones changed models and their downstream dependencies into an isolated target schema using dbt clone, allowing integration testing without affecting production. Uses cached state from init.
Important:
--target and --vars must match the environment you want to clone into. The clone operation reads your profiles.yml to determine the target database/schema; if these are wrong, models will be cloned to the wrong location or the command will fail.
dbt-ci ephemeral \
--target my-pr-env \
--vars '{"use_production_data":"false"}'
How it works:
- Reads the cached change set from init
- Builds a selection of all affected models and their downstream dependencies
- Runs dbt clone --select <nodes> targeting the specified environment
- The cloned tables/views can then be used as the base for subsequent dbt run commands in the PR environment
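The selection and clone steps above can be sketched in Python as a breadth-first walk over the manifest's "child_map", followed by assembling a dbt clone command line. This is a sketch under stated assumptions, not dbt-ci's code: the function names are hypothetical, models are assumed to be unversioned, and the --state argument is included because dbt clone needs a state directory to clone from.

```python
from collections import deque


def with_downstream(changed, child_map):
    """Return the changed nodes plus everything reachable downstream of them.

    `child_map` maps unique_id -> list of direct children, the same shape as
    the "child_map" section of a dbt manifest.json.
    """
    seen = set(changed)
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for child in child_map.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def clone_command(unique_ids, target, state_dir):
    """Build a `dbt clone` argv that selects the given models by name."""
    # unique_ids look like "model.<project>.<name>"; non-model nodes are skipped.
    names = sorted(
        uid.split(".", 2)[2] for uid in unique_ids if uid.startswith("model.")
    )
    return ["dbt", "clone", "--select", *names, "--state", state_dir, "--target", target]
```

Returning an argument list rather than a shell string keeps the command safe to pass to subprocess.run without quoting concerns.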
Flags:
| Flag | Aliases | Env Var(s) | Default | Description |
|---|---|---|---|---|
| --keep-env | | DBT_KEEP_ENV | false | Don't destroy the ephemeral environment after the run (if supported by the runner) |
All common options also apply.
delete - Delete Removed Models
Detects and deletes models that have been removed from the project. Uses cached state from init.
dbt-ci delete --dry-run # preview what will be deleted
dbt-ci delete # execute deletions
Flags:
Only common options apply — no command-specific flags.
finalize - Finalize State
Run after run, delete, or ephemeral to upload artifacts and clean up the local cache for the next CI run.
dbt-ci finalize
dbt-ci finalize --artifacts-uri s3://my-bucket/dbt-artifacts/
Flags:
| Flag | Aliases | Env Var(s) | Default | Description |
|---|---|---|---|---|
| --artifacts-uri | | DBT_ARTIFACTS_URI, ARTIFACTS_URI | None | Object storage URI for uploading run artifacts such as the updated manifest.json (e.g. s3://bucket/dbt-artifacts/) |
| --clean-ephemeral | --destroy-ephemeral | DBT_CLEAN_EPHEMERAL, DBT_DESTROY_EPHEMERAL | false | Clean up the ephemeral environment as part of finalization |
All common options also apply.
Runners
dbt-ci supports multiple execution environments:
Local Runner
Execute dbt commands directly on your machine:
# After init
dbt-ci run \
--runner local \
--dbt-project-dir dbt
dbt Runner (Python API)
Uses dbt's Python API (fastest, default):
# After init - uses dbt Python API
dbt-ci run \
--runner dbt \
--dbt-project-dir dbt
Docker Runner
Run dbt commands inside a Docker container:
dbt-ci run \
--runner docker \
--docker-image ghcr.io/dbt-labs/dbt-duckdb:latest \
--docker-volumes "$(pwd):/workspace" \
--dbt-project-dir /workspace/dbt \
--state /workspace/dbt/.dbtstate
For Apple Silicon Macs:
dbt-ci run \
--runner docker \
--docker-platform linux/amd64 \
--docker-image ghcr.io/dbt-labs/dbt-postgres:latest \
--docker-volumes "$(pwd):/workspace" \
--dbt-project-dir /workspace/dbt
Docker Advanced Options
Platform (for Apple Silicon compatibility):
--docker-platform linux/amd64 # or linux/arm64
Custom Volumes:
--docker-volumes "/host/path:/container/path" --docker-volumes "/another:/path:ro"
Environment Variables:
--docker-env "DBT_ENV=prod" --docker-env "MY_API_KEY=secret"
Network Mode:
--docker-network bridge # or host, none, container:name
User:
--docker-user "1000:1000" # or leave empty for auto-detect
Additional Docker Args:
--docker-args "--memory=2g --cpus=2"
Complete Docker Example:
dbt-ci run \
--runner docker \
--docker-image ghcr.io/dbt-labs/dbt-postgres:1.7.0 \
--docker-platform linux/amd64 \
--docker-env "POSTGRES_HOST=host.docker.internal" \
--docker-network host \
--docker-volumes "$(pwd):/workspace" \
--docker-volumes "$HOME/.aws:/root/.aws:ro" \
--dbt-project-dir /workspace/dbt \
--profiles-dir /workspace/dbt \
--target prod
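As a rough mental model of how the options above might translate into a docker run invocation, consider the Python sketch below. The mapping is an assumption, not taken from dbt-ci's source: the function name is hypothetical, and only standard docker run flags (--rm, --platform, -v, -e, --network, --user) are used.

```python
import shlex


def docker_run_argv(image, dbt_args, platform=None, volumes=(), env=(),
                    network=None, user=None, extra_args=""):
    """Assemble a `docker run` argv from runner-style options."""
    argv = ["docker", "run", "--rm"]
    if platform:
        argv += ["--platform", platform]
    for volume in volumes:           # host:container[:mode]
        argv += ["-v", volume]
    for pair in env:                 # KEY=VALUE
        argv += ["-e", pair]
    if network:
        argv += ["--network", network]
    if user:
        argv += ["--user", user]
    argv += shlex.split(extra_args)  # e.g. "--memory=2g --cpus=2"
    argv.append(image)
    # Anything after the image name is handed to the image's entrypoint.
    return argv + list(dbt_args)
```

Using shlex.split for the free-form extra arguments preserves quoted values, which a plain str.split would break apart.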
Common Options
These flags are available on every command.
Configuration File
dbt-ci supports a dbt-ci.config.yaml file as an alternative to passing every flag on the command line. It is loaded before any other options so that CLI flags and shell environment variables always take precedence.
Default location: dbt-ci.config.yaml in the current working directory (override with --config / DBT_CONFIG).
Keys in the file correspond to the environment variable names of each flag:
# dbt-ci.config.yaml
DBT_RUNNER: docker
DBT_DOCKER_IMAGE: docker.pkg.dev/my-project/dbt:latest
DBT_PROJECT_DIR: dbt
DBT_STATE: dbt/state
DBT_REFERENCE_TARGET: prod
DBT_DOCKER_VOLUMES: "${PWD}/dbt:/dbt"
DBT_DOCKER_ENV: "DBT_PROFILES_DIR=/dbt,GOOGLE_APPLICATION_CREDENTIALS=${GOOGLE_APPLICATION_CREDENTIALS}"
Precedence (highest → lowest):
- Shell environment variables
- CLI flags
- dbt-ci.config.yaml
- Built-in defaults
${VAR_NAME} references inside the config file are resolved from the shell environment at load time.
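The precedence order and the ${VAR_NAME} resolution can be illustrated with a short Python sketch. It mirrors the order stated above (shell environment, then CLI flags, then the config file, then defaults) but is not dbt-ci's implementation: the function names are hypothetical, config values are assumed to be strings, and leaving unknown variables untouched is an assumption.

```python
import re

_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_vars(value, environ):
    """Replace ${NAME} with environ[NAME]; unknown names are left untouched."""
    return _VAR.sub(lambda m: environ.get(m.group(1), m.group(0)), value)


def resolve(key, environ, cli, config, defaults):
    """Return the first value found, from highest to lowest precedence."""
    if key in environ:
        return environ[key]
    if key in cli:
        return cli[key]
    if key in config:
        return expand_vars(config[key], environ)
    return defaults.get(key)


environ = {"DBT_RUNNER": "local", "GOOGLE_APPLICATION_CREDENTIALS": "/keys/sa.json"}
cli = {"DBT_RUNNER": "docker", "DBT_TARGET": "ci"}
config = {
    "DBT_TARGET": "prod",
    "DBT_DOCKER_ENV": "GOOGLE_APPLICATION_CREDENTIALS=${GOOGLE_APPLICATION_CREDENTIALS}",
}
defaults = {"DBT_LOG_LEVEL": "INFO"}

print(resolve("DBT_RUNNER", environ, cli, config, defaults))      # prints local
print(resolve("DBT_TARGET", environ, cli, config, defaults))      # prints ci
print(resolve("DBT_DOCKER_ENV", environ, cli, config, defaults))
# prints GOOGLE_APPLICATION_CREDENTIALS=/keys/sa.json
```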
Note:
dbt-ci.config.yaml is ignored by git by default (it is listed in .gitignore). Use it for local developer overrides and commit a .example variant for your team.
Core
| Flag | Aliases | Env Var(s) | Default | Description |
|---|---|---|---|---|
| --dbt-project-dir | | DBT_PROJECT_DIR | . | Path to the dbt project directory |
| --profiles-dir | | DBT_PROFILES_DIR | Auto-detect | Path to the directory containing profiles.yml |
| --reference-state | --state | DBT_STATE | None | Local path to the reference state directory (where manifest.json is stored) |
| --target | -t | DBT_TARGET | From profiles.yml | dbt target to use |
| --vars | -v | DBT_VARS | "" | YAML string or path to a YAML file with dbt variables |
| --defer | | DBT_DEFER | false | Pass dbt's --defer flag (defers unmodified nodes to the production state) |
| --runner | -r | DBT_RUNNER | dbt | Runner to use: dbt, local, docker, bash |
| --entrypoint | | DBT_ENTRYPOINT | dbt | Command entrypoint for dbt |
| --dbt-version | | DBT_VERSION | Current | Pin a specific dbt version (e.g. 1.10.13) |
| --adapter | -a | DBT_ADAPTER | None | dbt adapter to install (e.g. dbt-bigquery, dbt-duckdb=1.10.0) |
| --config | -c | DBT_CONFIG | dbt-ci.config.yaml | Path to a dbt-ci YAML configuration file |
| --dry-run | | DBT_DRY_RUN | false | Print commands without executing them |
| --quiet | -q | DBT_QUIET | false | Run in quiet mode with minimal output |
| --log-level | | DBT_LOG_LEVEL | INFO | Logging verbosity: DEBUG, INFO, WARNING, ERROR, CRITICAL |
| --slack-webhook | --slack-webhook-url | SLACK_WEBHOOK, SLACK_WEBHOOK_URL | None | Slack webhook URL for CI notifications |
Docker Runner
Only used when --runner docker is set.
| Flag | Env Var(s) | Default | Description |
|---|---|---|---|
| --docker-image | DBT_DOCKER_IMAGE | ghcr.io/dbt-labs/dbt-core:latest | Docker image to use |
| --docker-platform | DBT_DOCKER_PLATFORM | Auto-detect | Platform override, e.g. linux/amd64 or linux/arm64 |
| --docker-volumes | DBT_DOCKER_VOLUMES | [] | Volume mounts (repeatable): host:container[:mode] |
| --docker-env | DBT_DOCKER_ENV | [] | Environment variables (repeatable): KEY=VALUE |
| --docker-network | DBT_DOCKER_NETWORK | host | Docker network mode |
| --docker-user | DBT_DOCKER_USER | Auto-detect | User to run as inside the container (UID:GID) |
| --docker-args | DBT_DOCKER_ARGS | "" | Extra arguments appended to docker run |
Bash Runner
Only used when --runner bash is set.
| Flag | Aliases | Env Var(s) | Default | Description |
|---|---|---|---|---|
| --shell-path | --bash-path | DBT_SHELL_PATH | /bin/bash | Path to the shell executable |
Cloud Storage Support
dbt-ci supports storing and retrieving state files from cloud storage (GCS, S3), making it ideal for distributed CI/CD workflows.
GCS/S3 State Storage
Store your dbt reference state in cloud storage for shared access across CI runs:
# Initialize and download state from GCS
dbt-ci init \
--dbt-project-dir dbt \
--state-uri gs://my-bucket/dbt-state/manifest.json \
--reference-target production \
--state dbt/.dbtstate
# Run using cached state (no need to specify URI again)
dbt-ci run --dbt-project-dir dbt --mode models
Benefits:
- 🔄 Shared State: Download the same reference state across different CI jobs
- 💾 Cache-Based: After init, commands use local cache (no repeated downloads)
- 📦 No Git Commits: State files don't need to be committed to version control
- 🚀 Scalable: Works seamlessly in containerized and distributed environments
- 🔐 Secure: Leverage cloud IAM and bucket policies for access control
Configuration:
The tool uses cloud credentials from your environment. Ensure your bucket is accessible:
# For GCS
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
# For AWS S3
export AWS_ACCESS_KEY_ID=your_key
export AWS_SECRET_ACCESS_KEY=your_secret
export AWS_DEFAULT_REGION=us-east-1
# Or use IAM roles (recommended in CI/CD)
dbt-ci init --state-uri gs://my-bucket/manifest.json
Supported URI Formats:
- gs://bucket-name/path/to/manifest.json (Google Cloud Storage)
- s3://bucket-name/path/to/manifest.json (AWS S3)
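For readers scripting around these URIs, the following Python sketch shows one way to split a state URI into backend, bucket, and object key using the standard library. The function name and the BACKENDS table are hypothetical and do not reflect dbt-ci internals.

```python
from urllib.parse import urlparse

# Schemes listed in the supported formats above.
BACKENDS = {"gs": "Google Cloud Storage", "s3": "AWS S3"}


def parse_state_uri(uri):
    """Split a state URI into (scheme, bucket, object key)."""
    parsed = urlparse(uri)
    if parsed.scheme not in BACKENDS:
        raise ValueError(f"unsupported scheme {parsed.scheme!r} in {uri!r}")
    if not parsed.netloc:
        raise ValueError(f"missing bucket name in {uri!r}")
    return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")


print(parse_state_uri("gs://my-bucket/dbt-state/manifest.json"))
# prints ('gs', 'my-bucket', 'dbt-state/manifest.json')
```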
Environment Variables
All CLI options can also be set via environment variables:
export DBT_PROJECT_DIR=./dbt
export DBT_PROFILES_DIR=./dbt
export DBT_TARGET=production
export DBT_RUNNER=local
# After running init, just use:
dbt-ci run
Common Environment Variables:
- DBT_PROJECT_DIR - Path to dbt project
- DBT_PROFILES_DIR - Path to profiles.yml location
- DBT_TARGET - Target environment to use
- DBT_RUNNER - Runner type (local, docker, bash, dbt)
Note: State management is cache-based. Run init once, then subsequent commands automatically use the cached state.
CI/CD Integration
GitHub Actions Example
name: dbt CI
on: [pull_request]
# Assuming a role without static keys uses GitHub OIDC, which needs id-token: write
permissions:
  id-token: write
  contents: read
jobs:
  dbt-ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
          aws-region: us-east-1
      - name: Install dbt-ci
        run: pip install git+https://github.com/datablock-dev/dbt-ci.git@main
      - name: Initialize dbt-ci with cloud state
        # The state lives in S3 here to match the AWS credentials configured above
        run: |
          dbt-ci init \
            --dbt-project-dir dbt \
            --state-uri s3://my-dbt-state/prod/manifest.json \
            --reference-target production \
            --state dbt/.dbtstate
      - name: Run modified models
        run: |
          dbt-ci run --mode models
GitLab CI Example
dbt-ci:
  image: python:3.11
  script:
    - pip install git+https://github.com/datablock-dev/dbt-ci.git@main
    - dbt-ci init --dbt-project-dir dbt --state-uri gs://my-dbt-state/prod/manifest.json --reference-target production --state dbt/.dbtstate
    - dbt-ci run --mode models
  only:
    - merge_requests
Features
- 🎯 Smart Detection: Automatically identifies modified, new, and deleted models
- 📊 Dependency Tracking: Generates and traverses dependency graphs for lineage analysis
- 🔄 State Comparison: Compares current state against production for precise CI
- ☁️ Cloud Storage: GCS and S3 integration for shared state across distributed CI/CD workflows
- 🚀 Multiple Runners: Supports local, Docker, bash, and dbt Python API execution
- 🐳 Docker-First: Extensive Docker configuration for containerized workflows
- ⚡ Selective Execution: Run only what changed, saving time and resources
- 🔌 Adapter Support: Install specific dbt versions and adapters on-demand
- 💬 Notifications: Slack webhook integration for CI/CD alerts
- ♻️ Ephemeral Environments: Test changes in isolated environments
- 🧹 Cleanup: Automatically remove deleted models from target warehouse
Use Cases
Pull Request CI
Only build and test models affected by PR changes:
# Initialize with reference state
dbt-ci init --state-uri gs://bucket/manifest.json --reference-target production --state dbt/.dbtstate
# Run modified models with defer
dbt-ci run --mode models --defer
Distributed CI with Cloud Storage
Share state across multiple CI jobs:
# Job 1: Initialize state (downloads from cloud)
dbt-ci init --state-uri gs://my-bucket/manifest.json --reference-target production --state dbt/.dbtstate
# Job 2: Run models (uses cached state)
dbt-ci run --mode models
# Job 3: Run tests (uses cached state)
dbt-ci run --mode tests
Selective Testing
Run tests only for modified models:
# After init
dbt-ci run --mode tests
Schema Migrations
Clean up deleted models from production:
# After init
dbt-ci delete --target production
Multi-Environment Testing
Create ephemeral test environments:
dbt-ci ephemeral --keep-env
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Development Setup
- Clone the repository
- Install dependencies: pip install -e ".[dev]"
- Run tests: pytest tests/
- Run linting: black src/ tests/
Commit Message Format
This project uses Conventional Commits for automated releases:
- feat: New feature (minor version bump)
- fix: Bug fix (patch version bump)
- docs: Documentation changes
- refactor: Code refactoring
- test: Adding tests
- chore: Maintenance tasks
Example:
git commit -m "feat: add Docker runner support"
git commit -m "fix: resolve path resolution on Windows"
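As an illustration of how a commit header maps to a version bump under the convention listed above, here is a small Python sketch. It is not the project's release tooling (see RELEASING.md for that): the function name is hypothetical, and breaking-change markers are deliberately left out because the list above does not cover them.

```python
import re

# Bump levels as listed above; all other commit types trigger no bump here.
BUMPS = {"feat": "minor", "fix": "patch"}

# "<type>: <description>" with an optional "(scope)" after the type.
_HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?: .+")


def bump_for(message):
    """Return 'minor', 'patch', or None for a commit message header."""
    match = _HEADER.match(message)
    if match is None:
        return None
    return BUMPS.get(match.group("type"))


print(bump_for("feat: add Docker runner support"))          # prints minor
print(bump_for("fix: resolve path resolution on Windows"))  # prints patch
print(bump_for("docs: update README"))                      # prints None
```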
See RELEASING.md for details on the automated release process.
License
See LICENSE file for details.
Links
- PyPI: https://pypi.org/project/dbt-ci/
- Documentation: https://datablock.dev
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Changelog: CHANGELOG.md
File details
Details for the file dbt_ci-1.3.3.tar.gz.
File metadata
- Download URL: dbt_ci-1.3.3.tar.gz
- Upload date:
- Size: 163.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1d1cefcbf333b1d7eef8e641f8667535399cb7e9609887d65dddc06a66d0ed7c |
| MD5 | 5105066fc4332e7fb6277fd28ad837b9 |
| BLAKE2b-256 | 0b28748898abc1a2b1b5a88cdefcd0782a5a1ed0a7f305c58fd9b3f34489c30c |
File details
Details for the file dbt_ci-1.3.3-py3-none-any.whl.
File metadata
- Download URL: dbt_ci-1.3.3-py3-none-any.whl
- Upload date:
- Size: 90.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 78f2016b28f3807bd4cbf98eac4220a305a3b202fed4fdb132a7762f6e6e41b2 |
| MD5 | 578cc2b546f030cdd71d551130555b4e |
| BLAKE2b-256 | be56e19bdbd0fe2d8061b414af56c4db034f8f317c9cc7fdb6ce0a866401f604 |