CI/CD tool for dbt projects with intelligent change detection and selective execution
dbt-ci
A CI tool for dbt (data build tool) projects that intelligently runs only modified models based on state comparison, supporting multiple execution environments including local, Docker, and dbt runners.
Installation
From PyPI (Recommended)
pip install dbt-ci
From GitHub
pip install git+https://github.com/datablock-dev/dbt-ci.git@main
Local Development
git clone https://github.com/datablock-dev/dbt-ci.git
cd dbt-ci
pip install -e ".[dev]"
After installation, the tool is available as dbt-ci.
Quick Start
1. Initialize State
First, initialize the dbt-ci state by compiling your project and creating a baseline:
dbt-ci init \
--dbt-project-dir dbt \
--profiles-dir dbt \
--production-target production
With Cloud Storage (S3):
dbt-ci init \
--dbt-project-dir dbt \
--state-uri s3://my-bucket/dbt-state/ \
--production-target production
2. Run Modified Models
After making changes to your dbt project, run only the modified models:
dbt-ci run \
--dbt-project-dir dbt \
--profiles-dir dbt \
--state dbt/.dbtstate
Or from S3:
dbt-ci run \
--dbt-project-dir dbt \
--state-uri s3://my-bucket/dbt-state/
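The run step depends on comparing the current project against the stored baseline. As a rough illustration of that idea (not dbt-ci's actual implementation), the sketch below compares the per-node file checksums that dbt writes into manifest.json. The function names `load_nodes` and `diff_nodes` are hypothetical, and dbt's own state comparison looks at more than the file checksum (configs and macros, for example).

```python
import json
from pathlib import Path


def load_nodes(manifest_path):
    """Return the `nodes` mapping from a dbt manifest.json file."""
    with Path(manifest_path).open(encoding="utf-8") as fh:
        return json.load(fh).get("nodes", {})


def diff_nodes(reference_nodes, current_nodes):
    """Classify node ids as new or modified relative to the reference state."""
    new, modified = [], []
    for unique_id, node in current_nodes.items():
        ref = reference_nodes.get(unique_id)
        if ref is None:
            # Node does not exist in the baseline at all.
            new.append(unique_id)
            continue
        ref_sum = ref.get("checksum", {}).get("checksum")
        cur_sum = node.get("checksum", {}).get("checksum")
        if ref_sum != cur_sum:
            # Same node id, different file contents.
            modified.append(unique_id)
    return {"new": sorted(new), "modified": sorted(modified)}


if __name__ == "__main__":
    reference = {"model.shop.orders": {"checksum": {"checksum": "aaa"}}}
    current = {
        "model.shop.orders": {"checksum": {"checksum": "bbb"}},
        "model.shop.payments": {"checksum": {"checksum": "ccc"}},
    }
    print(diff_nodes(reference, current))
```

In practice the two mappings would come from `load_nodes` applied to the baseline manifest and to a freshly compiled one.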
Commands
init - Initialize State
Creates initial state from your dbt project. Always run this first.
dbt-ci init \
--dbt-project-dir dbt \
--profiles-dir dbt \
--production-target production
Options:
- --production-target: Target to use for production/reference manifest (optional)
- --dbt-version: Specific dbt version to use (e.g., 1.10.13)
- --adapter, -a: Adapter to install (e.g., dbt-duckdb=1.10.0)
run - Run Modified Models
Detects and runs models that have changed:
dbt-ci run \
--dbt-project-dir dbt \
--state dbt/.dbtstate \
--mode models
With Cloud Storage:
dbt-ci run \
--dbt-project-dir dbt \
--state-uri s3://my-bucket/dbt-state/ \
--mode models
Options:
- --mode, -m: What to run: all, models, seeds, snapshots, tests (default: all)
- --levels: Number of dependency levels to include
- --defer: Use dbt's defer flag for production state
Examples:
# Run only modified models
dbt-ci run --mode models
# Run modified models with 2 levels of dependencies
dbt-ci run --mode models --levels 2
# Run all modified resources (models, tests, seeds, etc.)
dbt-ci run --mode all
# Run with cloud storage
dbt-ci run --state-uri s3://my-bucket/state/ --mode models
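To make the --levels option more concrete, here is a minimal sketch of expanding a set of modified nodes by a fixed number of dependency hops. It assumes the expansion goes downstream and uses a mapping shaped like the `child_map` section of dbt's manifest.json; the direction, the function name `expand_selection`, and the traversal itself are assumptions for illustration, not a description of dbt-ci internals.

```python
from collections import deque


def expand_selection(selected, child_map, levels):
    """Return `selected` plus every node reachable within `levels` downstream hops."""
    seen = set(selected)
    queue = deque((node, 0) for node in selected)
    while queue:
        node, depth = queue.popleft()
        if depth == levels:
            # Hop budget used up on this branch; do not go further.
            continue
        for child in child_map.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return seen


if __name__ == "__main__":
    # Hypothetical lineage: a -> b -> c -> d
    child_map = {
        "model.shop.a": ["model.shop.b"],
        "model.shop.b": ["model.shop.c"],
        "model.shop.c": ["model.shop.d"],
    }
    print(sorted(expand_selection(["model.shop.a"], child_map, levels=2)))
```

A breadth-first walk is used so that each node is reached by its shortest path, which keeps the hop count meaningful when a model has several parents.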
ephemeral - Ephemeral Environment
Creates ephemeral environments for testing without affecting production:
dbt-ci ephemeral \
--dbt-project-dir dbt \
--state dbt/.dbtstate
Options:
--keep-env: Don't destroy ephemeral environment after run
delete - Delete Removed Models
Detects and deletes models that have been removed from the project:
dbt-ci delete \
--dbt-project-dir dbt \
--state dbt/.dbtstate
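Detecting removed models amounts to finding nodes that exist in the reference manifest but not in the current one. The sketch below shows that set difference over manifest `nodes` mappings; `removed_models` is a hypothetical helper, and how dbt-ci actually resolves and drops the relations is not covered here.

```python
def removed_models(reference_nodes, current_nodes):
    """List models present in the reference manifest but absent from the current one."""
    removed = []
    for unique_id, node in reference_nodes.items():
        if node.get("resource_type") != "model":
            # Seeds, tests, snapshots and so on are ignored in this sketch.
            continue
        if unique_id not in current_nodes:
            # `alias` is the relation name when set; otherwise fall back to `name`.
            relation = node.get("alias") or node.get("name")
            removed.append((unique_id, f"{node.get('schema')}.{relation}"))
    return sorted(removed)


if __name__ == "__main__":
    reference = {
        "model.shop.orders": {"resource_type": "model", "schema": "analytics", "name": "orders"},
        "model.shop.legacy": {"resource_type": "model", "schema": "analytics", "name": "legacy"},
    }
    current = {
        "model.shop.orders": {"resource_type": "model", "schema": "analytics", "name": "orders"},
    }
    for unique_id, relation in removed_models(reference, current):
        print(f"{unique_id} -> {relation}")
```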
Runners
dbt-ci supports multiple execution environments:
Local Runner
Execute dbt commands directly on your machine:
dbt-ci run \
--runner local \
--dbt-project-dir dbt \
--state dbt/.dbtstate
dbt Runner (Python API)
Uses dbt's Python API (fastest, default):
dbt-ci run \
--runner dbt \
--dbt-project-dir dbt \
--state dbt/.dbtstate
Docker Runner
Run dbt commands inside a Docker container:
dbt-ci run \
--runner docker \
--docker-image ghcr.io/dbt-labs/dbt-duckdb:latest \
--docker-volumes "$(pwd):/workspace" \
--dbt-project-dir /workspace/dbt \
--state /workspace/dbt/.dbtstate
For Apple Silicon Macs:
dbt-ci run \
--runner docker \
--docker-platform linux/amd64 \
--docker-image ghcr.io/dbt-labs/dbt-postgres:latest \
--docker-volumes "$(pwd):/workspace" \
--dbt-project-dir /workspace/dbt
Docker Advanced Options
Platform (for Apple Silicon compatibility):
--docker-platform linux/amd64 # or linux/arm64
Custom Volumes:
--docker-volumes "/host/path:/container/path" --docker-volumes "/another:/path:ro"
Environment Variables:
--docker-env "DBT_ENV=prod" --docker-env "MY_API_KEY=secret"
Network Mode:
--docker-network bridge # or host, none, container:name
User:
--docker-user "1000:1000" # or leave empty for auto-detect
Additional Docker Args:
--docker-args "--memory=2g --cpus=2"
Complete Docker Example:
dbt-ci run \
--runner docker \
--docker-image ghcr.io/dbt-labs/dbt-postgres:1.7.0 \
--docker-platform linux/amd64 \
--docker-env "POSTGRES_HOST=host.docker.internal" \
--docker-network host \
--docker-volumes "$(pwd):/workspace" \
--docker-volumes "$HOME/.aws:/root/.aws:ro" \
--dbt-project-dir /workspace/dbt \
--profiles-dir /workspace/dbt \
--target prod
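As a mental model for how these options could map onto a container invocation, the sketch below assembles a `docker run` argument list from the same kinds of values. It is an illustration only: the function `build_docker_command`, the use of `--rm`, the argument ordering, and the assumption that the image's entrypoint is `dbt` are all hypothetical, not taken from dbt-ci's source.

```python
import shlex


def build_docker_command(image, dbt_args, volumes=(), env=(), platform=None,
                         network=None, user=None, extra_args=""):
    """Assemble a `docker run` argument list from dbt-ci style Docker options."""
    cmd = ["docker", "run", "--rm"]
    if platform:
        cmd += ["--platform", platform]
    if network:
        cmd += ["--network", network]
    if user:
        cmd += ["--user", user]
    for volume in volumes:
        # Each entry uses docker's host:container[:mode] form.
        cmd += ["-v", volume]
    for pair in env:
        # Each entry is a KEY=VALUE string.
        cmd += ["-e", pair]
    # Free-form extra arguments are split the way a POSIX shell would split them.
    cmd += shlex.split(extra_args)
    cmd.append(image)
    cmd += list(dbt_args)
    return cmd


if __name__ == "__main__":
    command = build_docker_command(
        image="ghcr.io/dbt-labs/dbt-postgres:1.7.0",
        dbt_args=["run", "--project-dir", "/workspace/dbt"],
        volumes=["/host/path:/container/path", "/another:/path:ro"],
        env=["DBT_ENV=prod"],
        platform="linux/amd64",
        network="host",
        extra_args="--memory=2g --cpus=2",
    )
    print(shlex.join(command))
```

Building a list rather than a single string avoids quoting problems when paths or values contain spaces.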
Global Options
These options apply to all commands:
| Option | Description | Default |
|---|---|---|
| --dbt-project-dir | Path to dbt project directory | . |
| --profiles-dir | Path to profiles.yml directory | Auto-detect |
| --state, --reference-state | Path to the reference manifest.json directory | Required for run/delete |
| --state-uri | Cloud storage URI for state files (e.g., s3://bucket/path/) | None |
| --production-target | dbt target for production/reference manifest | None |
| --target, -t | dbt target to use | From profiles.yml |
| --vars, -v | YAML string or file path with dbt variables | "" |
| --defer | Use dbt's defer flag for production state | false |
| --runner, -r | Runner type: local, docker, bash, dbt | dbt |
| --entrypoint | Command entrypoint for dbt | dbt |
| --dbt-version | Specific dbt version to use | Current |
| --adapter, -a | Adapter to install (format: dbt-adapter=version) | None |
| --dry-run | Print commands without executing | false |
| --log-level | Logging level: DEBUG, INFO, WARNING, ERROR, CRITICAL | INFO |
| --slack-webhook | Slack webhook URL for notifications | None |
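The --adapter value uses a single `=` between package and version (for example dbt-duckdb=1.10.0), which differs from pip's `==` syntax. The sketch below shows one way such a value could be parsed and turned into a pip requirement string. Whether dbt-ci installs adapters through pip, and the helper names `parse_adapter` and `to_pip_requirement`, are assumptions made for illustration.

```python
def parse_adapter(spec):
    """Split 'dbt-adapter=version' into (package, version); version may be None."""
    package, sep, version = spec.partition("=")
    package = package.strip()
    # Tolerate pip-style 'pkg==1.2.3' by dropping the extra '='.
    version = version.strip().lstrip("=") if sep else ""
    if not package:
        raise ValueError(f"invalid adapter spec: {spec!r}")
    return package, (version or None)


def to_pip_requirement(spec):
    """Render the adapter spec as a pip requirement string."""
    package, version = parse_adapter(spec)
    return f"{package}=={version}" if version else package


if __name__ == "__main__":
    print(to_pip_requirement("dbt-duckdb=1.10.0"))
```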
Docker Options
| Option | Description | Default |
|---|---|---|
| --docker-image | Docker image for dbt | ghcr.io/dbt-labs/dbt-core:latest |
| --docker-platform | Platform (linux/amd64, linux/arm64) | Auto-detect |
| --docker-volumes | Volume mounts (format: host:container[:mode]) | [] |
| --docker-env | Environment variables (format: KEY=VALUE) | [] |
| --docker-network | Docker network mode | host |
| --docker-user | User to run as (UID:GID) | Auto-detect |
| --docker-args | Additional docker run arguments | "" |
Bash Runner Options
| Option | Description | Default |
|---|---|---|
| --shell-path, --bash-path | Path to shell executable | /bin/bash |
Cloud Storage Support
dbt-ci supports storing and retrieving state files from cloud storage, making it ideal for distributed CI/CD workflows.
S3 State Storage
Store your dbt state in S3 for shared access across CI runs:
# Initialize and upload state to S3
dbt-ci init \
--dbt-project-dir dbt \
--state-uri s3://my-bucket/dbt-state/ \
--production-target production
# Run using state from S3
dbt-ci run \
--dbt-project-dir dbt \
--state-uri s3://my-bucket/dbt-state/ \
--mode models
Benefits:
- 🔄 Shared State: Access the same state across different CI jobs and environments
- 📦 No Local Storage: State files don't need to be committed to git
- 🚀 Scalable: Works seamlessly in containerized and distributed environments
- 🔐 Secure: Leverage AWS IAM and S3 bucket policies for access control
Configuration:
The tool uses AWS credentials from your environment (AWS CLI, IAM roles, environment variables). Ensure your S3 bucket is accessible:
# AWS credentials via environment
export AWS_ACCESS_KEY_ID=your_key
export AWS_SECRET_ACCESS_KEY=your_secret
export AWS_DEFAULT_REGION=us-east-1
# Or use IAM roles (recommended in CI/CD)
dbt-ci run --state-uri s3://my-bucket/dbt-state/
Supported URI Formats:
- s3://bucket-name/path/to/state/
- s3://bucket-name/dbt-state/
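A state URI of this form splits into a bucket name and a key prefix. The following sketch shows that split using only the standard library. The helper names are hypothetical, and the assumption that the state object is called manifest.json under the prefix is inferred from the --state description rather than confirmed.

```python
from urllib.parse import urlparse


def parse_state_uri(uri):
    """Split an s3:// URI into (bucket, key_prefix), with a trailing slash on the prefix."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"unsupported state URI: {uri!r}")
    prefix = parsed.path.lstrip("/")
    if prefix and not prefix.endswith("/"):
        # Normalize so that object names can be appended directly.
        prefix += "/"
    return parsed.netloc, prefix


def manifest_location(uri):
    """Return (bucket, key) for a manifest.json stored under the state prefix."""
    bucket, prefix = parse_state_uri(uri)
    return bucket, prefix + "manifest.json"


if __name__ == "__main__":
    print(parse_state_uri("s3://bucket-name/path/to/state/"))
    print(manifest_location("s3://bucket-name/dbt-state"))
```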
Environment Variables
All CLI options can also be set via environment variables:
export DBT_PROJECT_DIR=./dbt
export DBT_PROFILES_DIR=./dbt
export DBT_STATE=./dbt/.dbtstate
export DBT_STATE_URI=s3://my-bucket/dbt-state/
export DBT_TARGET=production
export DBT_RUNNER=local
dbt-ci run
Common Environment Variables:
- DBT_STATE or STATE_DIR - Local path to state directory
- DBT_STATE_URI or STATE_URI - Cloud storage URI for state files
- DBT_PROJECT_DIR - Path to dbt project
- DBT_PROFILES_DIR - Path to profiles.yml location
- DBT_TARGET - Target environment to use
- DBT_RUNNER - Runner type (local, docker, bash, dbt)
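Several options accept more than one environment variable name (DBT_STATE or STATE_DIR, for instance). The sketch below shows one plausible resolution order: an explicit CLI value first, then each environment variable in turn, then a default. That precedence and the function name `resolve_option` are assumptions for illustration; the README does not spell out how dbt-ci resolves conflicts.

```python
import os


def resolve_option(cli_value, env_names, default=None, environ=None):
    """Return the CLI value if given, else the first non-empty env var, else default."""
    environ = os.environ if environ is None else environ
    if cli_value is not None:
        return cli_value
    for name in env_names:
        value = environ.get(name)
        if value:
            # Empty strings are treated as unset.
            return value
    return default


if __name__ == "__main__":
    fake_env = {"STATE_DIR": "./dbt/.dbtstate"}
    print(resolve_option(None, ["DBT_STATE", "STATE_DIR"], environ=fake_env))
    print(resolve_option(None, ["DBT_RUNNER"], default="dbt", environ=fake_env))
```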
CI/CD Integration
GitHub Actions Example
name: dbt CI
on: [pull_request]
# Needed so configure-aws-credentials can assume the role via GitHub OIDC
permissions:
  id-token: write
  contents: read
jobs:
  dbt-ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v2
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
          aws-region: us-east-1
      - name: Install dbt-ci
        run: pip install git+https://github.com/datablock-dev/dbt-ci.git@main
      - name: Initialize dbt-ci with S3 state
        run: |
          dbt-ci init \
            --dbt-project-dir dbt \
            --state-uri s3://my-dbt-state/prod/ \
            --production-target production
      - name: Run modified models
        run: |
          dbt-ci run \
            --mode models \
            --state-uri s3://my-dbt-state/prod/
GitLab CI Example
dbt-ci:
  image: python:3.11
  script:
    - pip install git+https://github.com/datablock-dev/dbt-ci.git@main
    - dbt-ci init --dbt-project-dir dbt --state-uri s3://my-dbt-state/prod/ --production-target production
    - dbt-ci run --mode models --state-uri s3://my-dbt-state/prod/
  only:
    - merge_requests
Features
- 🎯 Smart Detection: Automatically identifies modified, new, and deleted models
- 📊 Dependency Tracking: Generates and traverses dependency graphs for lineage analysis
- 🔄 State Comparison: Compares current state against production for precise CI
- ☁️ Cloud Storage: S3 integration for shared state across distributed CI/CD workflows
- 🚀 Multiple Runners: Supports local, Docker, bash, and dbt Python API execution
- 🐳 Docker-First: Extensive Docker configuration for containerized workflows
- ⚡ Selective Execution: Run only what changed, saving time and resources
- 🔌 Adapter Support: Install specific dbt versions and adapters on-demand
- 💬 Notifications: Slack webhook integration for CI/CD alerts
- ♻️ Ephemeral Environments: Test changes in isolated environments
- 🧹 Cleanup: Automatically remove deleted models from target warehouse
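For the Slack notification feature listed above, the general mechanism is an HTTP POST of a small JSON document to an incoming webhook URL. The sketch below prepares such a request with the standard library and does not send it. The function name, the message text, and the placeholder URL are hypothetical, and the message format dbt-ci actually posts is not documented here.

```python
import json
import urllib.request


def build_slack_request(webhook_url, text):
    """Prepare (but do not send) a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL; a real webhook URL comes from your Slack workspace.
    request = build_slack_request(
        "https://hooks.slack.com/services/EXAMPLE",
        "dbt-ci: modified models rebuilt",
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request, timeout=10)
```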
Use Cases
Pull Request CI
Only build and test models affected by PR changes:
dbt-ci init --production-target production
dbt-ci run --mode models --defer
Distributed CI with Cloud Storage
Share state across multiple CI jobs using S3:
# Job 1: Initialize state
dbt-ci init --state-uri s3://my-bucket/dbt-state/ --production-target production
# Job 2: Run models
dbt-ci run --state-uri s3://my-bucket/dbt-state/ --mode models
# Job 3: Run tests
dbt-ci run --state-uri s3://my-bucket/dbt-state/ --mode tests
Selective Testing
Run tests only for modified models:
dbt-ci run --mode tests --state dbt/.dbtstate
Schema Migrations
Clean up deleted models from production:
dbt-ci delete --target production
Multi-Environment Testing
Create ephemeral test environments:
dbt-ci ephemeral --keep-env
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Development Setup
- Clone the repository
- Install dependencies: pip install -e ".[dev]"
- Run tests: pytest tests/
- Run linting: black src/ tests/
Commit Message Format
This project uses Conventional Commits for automated releases:
- feat: New feature (minor version bump)
- fix: Bug fix (patch version bump)
- docs: Documentation changes
- refactor: Code refactoring
- test: Adding tests
- chore: Maintenance tasks
Example:
git commit -m "feat: add Docker runner support"
git commit -m "fix: resolve path resolution on Windows"
See RELEASING.md for details on the automated release process.
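The mapping from commit type to version bump described above can be sketched in a few lines. This is only an illustration of the convention: the project's real release tooling is described in RELEASING.md and may behave differently (for example around breaking changes, which this sketch does not handle). The names `bump_for` and `release_bump` are hypothetical.

```python
import re

# Only the two bump rules stated in this README are encoded here.
_BUMP_BY_TYPE = {"feat": "minor", "fix": "patch"}
_ORDER = [None, "patch", "minor"]
# "type: subject" with an optional "(scope)" after the type.
_HEADER = re.compile(r"^(?P<type>[a-z]+)(?:\([^)]*\))?: .+")


def bump_for(message):
    """Return 'minor', 'patch', or None for a single commit message."""
    first_line = message.splitlines()[0] if message else ""
    match = _HEADER.match(first_line)
    if not match:
        return None
    return _BUMP_BY_TYPE.get(match.group("type"))


def release_bump(messages):
    """Return the largest bump implied by a list of commit messages."""
    bumps = [bump_for(message) for message in messages]
    return max(bumps, key=_ORDER.index, default=None)


if __name__ == "__main__":
    commits = [
        "feat: add Docker runner support",
        "fix: resolve path resolution on Windows",
        "docs: update README",
    ]
    for commit in commits:
        print(f"{commit!r} -> {bump_for(commit)}")
    print("release:", release_bump(commits))
```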
License
See LICENSE file for details.
Links
- PyPI: https://pypi.org/project/dbt-ci/
- Documentation: https://datablock.dev
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Changelog: CHANGELOG.md