No project description provided

These details have not been verified by PyPI

Project description

dbt-incremental-ci

A Python library to copy production incremental models and snapshots to CI schema for dbt testing.

Overview

This library helps with dbt CI workflows by:

Identifying modified dbt models using dbt ls --select modified+ --defer
Filtering for incremental models and snapshots that exist in production
Copying those tables from production to CI schema with all data
Supporting parallel execution for faster copying

Installation

pip install dbt-incremental-ci

Usage

CLI - Local Manifest

dbt-incremental-ci \
  --prod-manifest-path path/to/prod/manifest.json \
  --dbt-project-dir path/to/dbt/project \
  --database-uri "postgresql://user:pass@host:5432/db" \
  --ci-schema "ci_schema" \
  --threads 4

CLI - dbt Cloud

# Set environment variables (recommended)
export DBT_CLOUD_API_TOKEN="your_token"
export DBT_CLOUD_ACCOUNT_ID="12345"

dbt-incremental-ci \
  --dbt-cloud-job-id "67890" \
  --dbt-project-dir path/to/dbt/project \
  --database-uri "postgresql://user:pass@host:5432/db" \
  --ci-schema "ci_schema" \
  --threads 4

# Or pass credentials directly
dbt-incremental-ci \
  --dbt-cloud-token "your_token" \
  --dbt-cloud-account-id "12345" \
  --dbt-cloud-job-id "67890" \
  --dbt-project-dir path/to/dbt/project \
  --database-uri "postgresql://user:pass@host:5432/db" \
  --ci-schema "ci_schema" \
  --threads 4

Python API - Local Manifest

from dbt_incremental_ci import DbtIncrementalCI

ci = DbtIncrementalCI(
    prod_manifest_path="path/to/prod/manifest.json",
    dbt_project_dir="path/to/dbt/project",
    database_uri="postgresql://user:pass@host:5432/db",
    ci_schema="ci_schema",
    threads=4
)

ci.run()
ci.cleanup()

Python API - dbt Cloud

from dbt_incremental_ci import DbtIncrementalCI

ci = DbtIncrementalCI(
    dbt_cloud_token="your_token",
    dbt_cloud_account_id="12345",
    dbt_cloud_job_id="67890",
    dbt_project_dir="path/to/dbt/project",
    database_uri="postgresql://user:pass@host:5432/db",
    ci_schema="ci_schema",
    threads=4
)

ci.run()
ci.cleanup()

Parameters

Manifest Source (choose one)

--prod-manifest-path: Path to production dbt manifest.json (local file)
--dbt-cloud-token: dbt Cloud API token (or set DBT_CLOUD_API_TOKEN env var)
--dbt-cloud-account-id: dbt Cloud account ID (or set DBT_CLOUD_ACCOUNT_ID env var)
--dbt-cloud-job-id: dbt Cloud job ID to fetch manifest from
--dbt-cloud-run-id: Optional specific run ID (uses latest successful if not provided)

Required

--dbt-project-dir: Path to dbt project directory
--database-uri: SQLAlchemy database URI
--ci-schema: Target CI schema name

Optional

--threads: Number of parallel threads for copying (default: 1)
--base-schema: Production base schema name (auto-detected from manifest if not provided)
--dry-run: Show what would be copied without actually copying tables
--verbose / -v: Enable verbose logging

Supported Databases

PostgreSQL
Amazon Redshift
Google BigQuery
Trino
Any database supported by SQLAlchemy

Why Use This Tool?

Traditional dbt CI runs all models from scratch, which can be slow and expensive for large projects. This tool optimizes CI by:

Only copying data for incremental models and snapshots that have changed
Leveraging existing production data instead of rebuilding everything
Running copies in parallel for maximum speed
Supporting any SQL database through SQLAlchemy

How It Works

┌─────────────────────────────────────────────────────────┐
│ 1. Detect Modified Models                               │
│    Run: dbt ls --select modified+ --defer               │
│    Output: List of changed models                       │
└─────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────┐
│ 2. Filter for Incremental/Snapshots                     │
│    Check production manifest for:                       │
│    - Incremental materialization                        │
│    - Snapshot resource type                             │
└─────────────────────────────────────────────────────────┘
                          ↓
┌─────────────────────────────────────────────────────────┐
│ 3. Copy Tables to CI Schema                             │
│    CREATE TABLE ci_schema.table                         │
│    AS SELECT * FROM prod_schema.table                   │
│    (in parallel with configurable threads)              │
└─────────────────────────────────────────────────────────┘

Dry Run Mode

Preview what tables would be copied without actually copying them:

dbt-incremental-ci \
  --prod-manifest-path prod/manifest.json \
  --dbt-project-dir . \
  --database-uri "postgresql://user:pass@host:5432/db" \
  --ci-schema "ci_schema" \
  --dry-run \
  --verbose

This will show:

Which tables would be copied
Source and target table names
Materialization type (incremental or snapshot)
SQL queries that would be executed (with --verbose)

dbt Cloud Integration

Fetch the production manifest directly from dbt Cloud API:

# Using environment variables (recommended for CI/CD)
export DBT_CLOUD_API_TOKEN="your_api_token"
export DBT_CLOUD_ACCOUNT_ID="12345"

dbt-incremental-ci \
  --dbt-cloud-job-id "67890" \
  --dbt-project-dir . \
  --database-uri "$DATABASE_URI" \
  --ci-schema "ci_pr_123" \
  --threads 4

The tool will:

Connect to dbt Cloud API
Find the latest successful run for the specified job
Download the manifest.json artifact
Use it to identify incremental models and snapshots

Getting dbt Cloud Credentials

API Token: Generate from Account Settings → API Access
Account ID: Found in your dbt Cloud URL: https://cloud.getdbt.com/deploy/{account_id}/...
Job ID: Found in the job URL or job settings

Example CI Workflows

Using Local Manifest

# Step 1: Download production manifest
aws s3 cp s3://my-bucket/prod/manifest.json ./prod_manifest.json

# Step 2: Copy incremental tables to CI
dbt-incremental-ci \
  --prod-manifest-path ./prod_manifest.json \
  --dbt-project-dir . \
  --database-uri "$DATABASE_URI" \
  --ci-schema "ci_pr_123" \
  --threads 4

# Step 3: Run dbt on CI schema
dbt build --select modified+ --schema ci_pr_123

# Step 4: Run tests
dbt test --schema ci_pr_123

Using dbt Cloud

# Step 1: Copy incremental tables from dbt Cloud production job
export DBT_CLOUD_API_TOKEN="your_token"
export DBT_CLOUD_ACCOUNT_ID="12345"

dbt-incremental-ci \
  --dbt-cloud-job-id "67890" \
  --dbt-project-dir . \
  --database-uri "$DATABASE_URI" \
  --ci-schema "ci_pr_${PR_NUMBER}" \
  --threads 4

# Step 2: Run dbt on CI schema
dbt build --select modified+ --schema "ci_pr_${PR_NUMBER}"

# Step 3: Run tests
dbt test --schema "ci_pr_${PR_NUMBER}"

Features

dbt Cloud integration - Fetch manifest directly from dbt Cloud API
Custom schema support - Automatically preserves dbt custom schema suffixes when copying to CI
Multi-database support (PostgreSQL, Redshift, BigQuery, Trino, etc.)
Parallel table copying with configurable thread count
Dry-run mode to preview changes before executing
Auto-detection of production base schema from manifest
Automatic schema creation (including custom schema suffixes)
Comprehensive error handling and logging
CLI and Python API interfaces
CI/CD ready with exit codes and detailed output

Custom Schema Handling

The tool automatically handles dbt custom schemas. If your production model uses:

models:
  my_project:
    marts:
      core:
        my_incremental_model:
          +schema: custom_suffix
          +materialized: incremental

And the production table is in prod_schema_custom_suffix, the tool will:

Auto-detect the base schema (prod_schema)
Detect the custom suffix (_custom_suffix)
Copy to CI with the same suffix: ci_schema_custom_suffix

Example:

Production: edu_dbt_incremental_models.student_history
CI target: ci_test_incremental_models.student_history

The base schema is auto-detected from the manifest, or you can specify it with --base-schema.

Requirements

Python 3.8+
dbt-core
SQLAlchemy 1.4+
Database-specific drivers (psycopg2, redshift-connector, etc.)

Development

Setting Up Development Environment

# Clone the repository
git clone https://github.com/yourusername/dbt-incremental-ci.git
cd dbt-incremental-ci

# Install Poetry
curl -sSL https://install.python-poetry.org | python3 -

# Install dependencies
poetry install

# Activate virtual environment
poetry shell

Running Tests Locally

# Test the CLI
poetry run dbt-incremental-ci --help

# Run linting
poetry run flake8 src/dbt_incremental_ci

# Build the package
poetry build

CI/CD

This project uses Poetry for dependency management and includes GitHub Actions workflows:

CI Workflow (`.github/workflows/ci.yml`)

Runs on push and pull requests to main branch
Tests package installation on Python 3.9, 3.10, 3.11, and 3.12
Lints code with flake8
Builds package artifacts with Poetry

Publish Workflow (`.github/workflows/publish.yml`)

Triggers on push to main branch
Uses dorny/paths-filter to detect changes in:
- src/dbt_incremental_ci/**/*
- poetry.lock
- pyproject.toml
Only publishes if package files have changed
Automatically publishes to PyPI using Poetry when changes are detected
Requires PYPI_TOKEN secret to be set in repository settings

Setting Up PyPI Token

Generate a PyPI API token at https://pypi.org/manage/account/token/
Add it to GitHub repository secrets as PYPI_TOKEN:
- Go to repository Settings → Secrets and variables → Actions
- Click "New repository secret"
- Name: PYPI_TOKEN
- Value: Your PyPI token

License

MIT License

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.1.0a15 pre-release

Apr 29, 2026

0.1.0a12 pre-release

Mar 11, 2026

0.1.0a11 pre-release

Nov 30, 2025

0.1.0a10 pre-release

Nov 30, 2025

0.1.0a9 pre-release

Nov 30, 2025

0.1.0a8 pre-release

Nov 30, 2025

0.1.0a7 pre-release

Nov 30, 2025

0.1.0a6 pre-release

Nov 30, 2025

0.1.0a5 pre-release

Nov 30, 2025

0.1.0a4 pre-release

Nov 30, 2025

0.1.0a3 pre-release

Nov 30, 2025

0.1.0a2 pre-release

Nov 30, 2025

This version

0.1.0a1 pre-release

Nov 30, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dbt_incremental_ci-0.1.0a1.tar.gz (15.1 kB view details)

Uploaded Nov 30, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

dbt_incremental_ci-0.1.0a1-py3-none-any.whl (16.3 kB view details)

Uploaded Nov 30, 2025 Python 3

File details

Details for the file dbt_incremental_ci-0.1.0a1.tar.gz.

File metadata

Download URL: dbt_incremental_ci-0.1.0a1.tar.gz
Upload date: Nov 30, 2025
Size: 15.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.2.1 CPython/3.10.19 Linux/6.11.0-1018-azure

File hashes

Hashes for dbt_incremental_ci-0.1.0a1.tar.gz
Algorithm	Hash digest
SHA256	`8d9a15e52a0d31a83f94b95331c788d779878ac30ee7e3ed5c7d7be55dd0b099`
MD5	`2c45076bd2fe021bc4383eaf46a58857`
BLAKE2b-256	`1a2ab11882d6e5207d25f0d2c40e7898bb69c46e510c473141c9e6898124c2c4`

See more details on using hashes here.

File details

Details for the file dbt_incremental_ci-0.1.0a1-py3-none-any.whl.

File metadata

Download URL: dbt_incremental_ci-0.1.0a1-py3-none-any.whl
Upload date: Nov 30, 2025
Size: 16.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.2.1 CPython/3.10.19 Linux/6.11.0-1018-azure

File hashes

Hashes for dbt_incremental_ci-0.1.0a1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`eeda8adf49cbdef8c0f7184627f6180c15f844c1bcf35fa461ff075e0dfff00f`
MD5	`b4585fee56a064db5e1ec31dbb5dcb89`
BLAKE2b-256	`27a144cd96fe0113583cbcd5a9f57846ab345aa6c079336cc5548a68739cec08`

See more details on using hashes here.

dbt-incremental-ci 0.1.0a1

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

dbt-incremental-ci

Overview

Installation

Usage

CLI - Local Manifest

CLI - dbt Cloud

Python API - Local Manifest

Python API - dbt Cloud

Parameters

Manifest Source (choose one)

Required

Optional

Supported Databases

Why Use This Tool?

How It Works

Dry Run Mode

dbt Cloud Integration

Getting dbt Cloud Credentials

Example CI Workflows

Using Local Manifest

Using dbt Cloud

Features

Custom Schema Handling

Requirements

Development

Setting Up Development Environment

Running Tests Locally

CI/CD

CI Workflow (.github/workflows/ci.yml)

Publish Workflow (.github/workflows/publish.yml)

Setting Up PyPI Token

License

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

CI Workflow (`.github/workflows/ci.yml`)

Publish Workflow (`.github/workflows/publish.yml`)