sqlbuild

Typed, test-first SQL pipelines with local E2E testing

Project description

SQLBuild

Change-aware SQL pipelines: only rebuild what changed, with all state in the warehouse.

SQLBuild is a SQL-first framework for building reliable warehouse pipelines. Every build is change-aware by default -- models, seeds, functions, and Python nodes are fingerprinted, source freshness is tracked, and unchanged work (including audits that already passed) is skipped automatically. All state is persisted as append-only tables in the warehouse alongside your data. No external state database, no manifest files, no paid add-on.

It keeps the barrier to entry for SQL models as low as dbt's and can run alongside an existing dbt project. It adds ingestion, Python nodes, providers, and opt-in virtual environments for more advanced use cases, so you can expand scope as your project grows.

Key features

Change-aware builds by default -- Fingerprint-based tracking for models, seeds, functions, and Python nodes. Source freshness observation. Audits skipped when the model version hasn't changed. Cascade propagation with configurable replay windows (replay_on_change). Pass --force to override and run everything selected.
Warehouse-native state -- All change-tracking state lives in append-only tables (_sqlbuild_fingerprints, _sqlbuild_source_freshness) in your warehouse schemas. No external state database, no state machine, no corruption risk. The planner reads the latest row per identity, compares, and appends after successful builds.
Reuse from production -- Dev targets can opt into reuse_from to clone or copy unchanged relations from another target (e.g. prod) instead of rebuilding. Zero compute for models that match.
Audits that block bad data -- Audits run before data reaches the target table. For full table builds, SQLBuild materializes into a staging table and only promotes if audits pass. For incremental models, delta-phase audits validate each batch before DML.
SQL-first models with compile-time validation -- Define models as SQL files with MODEL() headers while SQLBuild resolves references, validates SQL, infers columns, checks contracts, and computes lineage before anything runs.
Cursor-based incremental processing -- Automatic gap detection and resume. If a model fails for several runs, the next build replays from where it left off. Microbatch mode splits large ranges into configurable batches.
Source freshness -- Track whether external source data has changed via adapter metadata, column queries, or custom SQL. Lag tolerance prevents jitter from triggering unnecessary rebuilds. sqb freshness observes freshness without running a build; --fail-on-stale gates CI pipelines.
Source loaders -- Load external data into source tables with Python @loader functions. Supports incremental write strategies (table, append, delete_insert, merge), cursor-based loading, and concurrent execution. Loaders run automatically during builds.
Python nodes -- Tasks (@task), assets (@asset), checks (@check), and factories (@factory) as first-class DAG nodes alongside SQL models. Identity-tracked with source and dependency diffs shown in the plan.
Providers -- Shared runtime services (API clients, connections, config) injected into Python nodes and hooks by parameter name. Backed by pydantic-settings for validation and environment variable support.
Python lifecycle hooks -- Typed sql("...")/python("hook_name") hooks with compile-time validation and a HookContext API. Provider injection supported.
Python macros, not Jinja -- Macros are real Python functions. Testable, debuggable, and composable with standard tooling.
User-defined functions -- SQL and Python UDFs managed as project resources, with table functions for predicate-pushdown-friendly alternatives to final-layer views.
Custom materializations -- Write materialization logic in Python with full framework integration, including audit hooks, schema change signals, and query change detection.
SQL unit tests that chain across models -- Mock your sources, assert on the model you care about, and SQLBuild resolves every intermediate model automatically. One test file can be a full integration test across your pipeline.
End-to-end scenarios with local replay -- Define coherent fixture worlds, run the real project graph in an isolated warehouse slice, capture JSONL snapshots, and replay them locally through DuckDB for fast CI feedback.
Data diffs -- Compare schemas and row-level data between targets with sqb diff prod:dev.
Zero-copy cloning -- Branch targets instantly with sqb clone without duplicating data. No manifest.json required.
Path-between selectors -- --select fact_orders~daily_activity_rollup selects every model on the shortest path between two nodes.
Virtual environments (alpha) -- Opt-in state-backed workflows for versioned model outputs, zero-copy branching, instant promotion and rollback, per-PR preview environments. Seeds are versioned alongside models. State stored in PostgreSQL or DuckDB, scoped per environment.
Python you can read, Rust where it counts -- The framework is Python. For SQL parsing, validation, column inference, lineage, and transpilation, SQLBuild uses Polyglot, a Rust reimplementation of SQLGlot's SQL analysis capabilities (MIT, 32+ dialects).
Dagster and Rivers integrations -- Models, loaders, tasks, assets, and checks map to Dagster/Rivers assets with dependency edges preserved. Python checks become asset checks.
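
The warehouse-native state model described above (read the latest row per identity, compare, append after a successful build) can be sketched with an in-memory SQLite database standing in for the warehouse. The table layout, the column names, and the helpers `fingerprint`, `latest_fingerprint`, `needs_build`, and `record_success` are illustrative assumptions, not SQLBuild's actual schema or API.

```python
import hashlib
import sqlite3
from typing import Optional

# Hypothetical stand-in for a warehouse; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fingerprints ("
    " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
    " identity TEXT NOT NULL,"
    " fingerprint TEXT NOT NULL)"
)


def fingerprint(sql_text: str) -> str:
    """Hash a model definition so that any edit yields a different value."""
    return hashlib.sha256(sql_text.encode("utf-8")).hexdigest()


def latest_fingerprint(identity: str) -> Optional[str]:
    """Return the most recently appended fingerprint for one identity."""
    row = conn.execute(
        "SELECT fingerprint FROM fingerprints"
        " WHERE identity = ? ORDER BY seq DESC LIMIT 1",
        (identity,),
    ).fetchone()
    return row[0] if row else None


def needs_build(identity: str, sql_text: str) -> bool:
    """A node is rebuilt when no state exists or the fingerprint differs."""
    return latest_fingerprint(identity) != fingerprint(sql_text)


def record_success(identity: str, sql_text: str) -> None:
    """Append a new row after a successful build; rows are never updated."""
    conn.execute(
        "INSERT INTO fingerprints (identity, fingerprint) VALUES (?, ?)",
        (identity, fingerprint(sql_text)),
    )


model_sql = "SELECT id AS order_id FROM raw_orders"
first = needs_build("stg_orders", model_sql)    # no prior row: build
record_success("stg_orders", model_sql)
second = needs_build("stg_orders", model_sql)   # unchanged: skip
third = needs_build("stg_orders", model_sql + " WHERE id > 0")  # edited: build
print(first, second, third)
```

Because rows are only ever appended, a failed build leaves the previous state in place, and the next plan compares against the last successful fingerprint.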

Quick start

pip install sqlbuild
# or
uv pip install sqlbuild

Create and run the included playground project:

sqb playground waffle-shop
cd waffle-shop
sqb plan
sqb build
sqb test
sqb scenario test

How it works

1. Define your models as SQL files with MODEL() headers that declare configuration, schema, and audits inline.
2. Compile to resolve references, validate SQL, infer column types, check contracts, and compute column lineage -- all offline.
3. Plan what needs to change by comparing fingerprints, source freshness, seed content, and Python node identities against the warehouse state. Unchanged models, seeds, audits, and Python nodes are skipped. Production relations can optionally be reused when version identities match.
4. Build by executing the plan: materializing only what changed, validating data before promotion, and ensuring bad data never reaches production.
5. Test with chained unit tests, E2E scenario tests, and local replay through DuckDB -- no warehouse required.
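
To illustrate the Build step, here is a minimal sketch of the stage, audit, promote pattern, using SQLite as a stand-in warehouse. The function `build_with_audit`, the `__staging` suffix, and the convention that an audit query returns the violating rows are assumptions made for illustration only.

```python
import sqlite3


def build_with_audit(conn: sqlite3.Connection, target: str,
                     select_sql: str, audit_sql: str) -> bool:
    """Materialize into a staging table and promote it only if the audit passes.

    audit_sql contains a {table} placeholder and returns rows that violate
    the rule; zero rows means the audit passed (an assumed convention).
    """
    staging = f"{target}__staging"
    conn.execute(f"DROP TABLE IF EXISTS {staging}")
    conn.execute(f"CREATE TABLE {staging} AS {select_sql}")

    failures = conn.execute(audit_sql.format(table=staging)).fetchall()
    if failures:
        # Bad data never reaches the target: discard the staging table.
        conn.execute(f"DROP TABLE {staging}")
        return False

    # Promotion: swap staging in for the target (not atomic in this sketch).
    conn.execute(f"DROP TABLE IF EXISTS {target}")
    conn.execute(f"ALTER TABLE {staging} RENAME TO {target}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (id INTEGER, total_cents INTEGER)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", [(1, 1500), (2, 900)])

select_sql = "SELECT id AS order_id, total_cents FROM raw_orders"
audit_sql = "SELECT * FROM {table} WHERE total_cents < 0"

promoted = build_with_audit(conn, "fact_orders", select_sql, audit_sql)

# A bad row arrives upstream; the next build is rejected by the audit.
conn.execute("INSERT INTO raw_orders VALUES (3, -50)")
rejected = build_with_audit(conn, "fact_orders", select_sql, audit_sql)

row_count = conn.execute("SELECT COUNT(*) FROM fact_orders").fetchone()[0]
print(promoted, rejected, row_count)
```

The target table keeps the last good result when an audit fails, which is the property the staging step is meant to provide.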

Example

A simple staging model:

MODEL (
  materialized view,
  tags [staging],
);

SELECT
  id AS order_id,
  customer_id,
  ordered_at,
  status
FROM __source("raw__orders")

An incremental model with microbatch processing:

MODEL (
  materialized incremental,
  incremental_strategy delete_insert,
  cursor activity_hour,
  cursor_type timestamp,
  cursor_grain hour,
  cursor_inputs (
    fact_orders ordered_at,
  ),
  incremental_mode microbatch,
  batch_size 1d,
  replay_on_change full,
  tags [marts],
  post_hooks [sql('GRANT SELECT ON @@CTX:destination.qualified TO analyst_role')],
);

SELECT
  DATE_TRUNC('hour', o.ordered_at) AS activity_hour,
  COUNT(*) AS orders_placed,
  SUM(o.quantity) AS waffles_ordered
FROM __ref("fact_orders") o
GROUP BY DATE_TRUNC('hour', o.ordered_at)
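
As a rough illustration of what microbatching a cursor range means, the sketch below splits a timestamp range into fixed-size half-open windows, similar in spirit to `batch_size 1d`. The function `split_into_batches` is hypothetical; how SQLBuild aligns or bounds its batches is not specified here.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def split_into_batches(
    start: datetime, end: datetime, batch_size: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield half-open [batch_start, batch_end) windows covering [start, end)."""
    if batch_size <= timedelta(0):
        raise ValueError("batch_size must be positive")
    cursor = start
    while cursor < end:
        batch_end = min(cursor + batch_size, end)
        yield cursor, batch_end
        cursor = batch_end


batches = list(
    split_into_batches(
        datetime(2026, 1, 1), datetime(2026, 1, 3, 12), timedelta(days=1)
    )
)
for batch_start, batch_end in batches:
    print(batch_start.isoformat(), "->", batch_end.isoformat())
```

If a run stops partway through, restarting from the end of the last completed window resumes processing without redoing earlier batches.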

A chained unit test:

TEST();

WITH
__source__raw__orders AS (
  @mock_orders()
),
__source__raw__payments AS (
  SELECT 1 AS payment_id, 1 AS order_id, 1500 AS amount_cents, 'credit_card' AS method
),
__expected__fact_orders AS (
  SELECT 1 AS order_id, 100 AS customer_id, 1500 AS total_cents,
         'credit_card' AS payment_method
),
__assert__no_negative_totals AS (
  SELECT * FROM __ref("fact_orders") WHERE total_cents < 0
)
SELECT 1

An end-to-end scenario:

SCENARIO (
  description "Customer refund updates daily revenue correctly",
  tags [revenue, refund],
);

WITH
__source__raw__orders AS (
  SELECT 1 AS order_id, DATE '2026-01-01' AS order_date, 100.00 AS amount
),
__source__raw__refunds AS (
  SELECT 1 AS refund_id, 1 AS order_id, DATE '2026-01-01' AS refund_date, 25.00 AS amount
),
__expected__daily_revenue AS (
  SELECT DATE '2026-01-01' AS order_date, 75.00 AS revenue
),
__assert__all_refunds_linked AS (
  SELECT * FROM __ref("fact_refunds") WHERE order_id IS NULL
)
SELECT 1

Scenario files live under tests/scenarios/**/*.sql. Run them in the target warehouse with:

sqb scenario test
sqb scenario test --select revenue__customer_refund --retain
sqb scenario test --select tests/scenarios/revenue --exclude revenue__slow_refund

Capture local replay snapshots as JSONL under tests/_scenario_snapshots/<scenario_name>/:

sqb scenario capture --select revenue__customer_refund
sqb scenario capture --select-file changed_scenarios.txt --exclude revenue__slow_refund
sqb scenario test --select revenue__customer_refund --local
sqb scenario test --local --sync-snapshots
sqb scenario test --local --refresh

Snapshots are committable test data. Review them for sensitive values before committing.
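
The capture and replay idea can be pictured as: write rows to a JSONL file, then load that file into a local in-process database and query it. The sketch below uses SQLite from the standard library as a stand-in for DuckDB. The file layout (one JSON object per row) and the helpers `capture` and `replay` are assumptions, not SQLBuild's snapshot format.

```python
import json
import sqlite3
import tempfile
from pathlib import Path
from typing import Dict, List


def capture(rows: List[Dict[str, object]], path: Path) -> None:
    """Write one JSON object per line so diffs stay readable in review."""
    with path.open("w", encoding="utf-8") as fh:
        for row in rows:
            fh.write(json.dumps(row, sort_keys=True) + "\n")


def replay(path: Path, conn: sqlite3.Connection, table: str) -> int:
    """Load a JSONL snapshot into a local table and return the row count."""
    lines = path.read_text(encoding="utf-8").splitlines()
    rows = [json.loads(line) for line in lines if line.strip()]
    if not rows:
        return 0
    columns = sorted(rows[0])
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
    conn.executemany(
        f"INSERT INTO {table} VALUES ({placeholders})",
        [tuple(row[c] for c in columns) for row in rows],
    )
    return len(rows)


with tempfile.TemporaryDirectory() as tmp:
    snapshot = Path(tmp) / "raw__orders.jsonl"
    capture(
        [{"order_id": 1, "order_date": "2026-01-01", "amount": 100.0}],
        snapshot,
    )
    local = sqlite3.connect(":memory:")
    loaded = replay(snapshot, local, "raw__orders")
    total = local.execute("SELECT SUM(amount) FROM raw__orders").fetchone()[0]
    print(loaded, total)
```

Plain-text snapshots like this are easy to review in a pull request, which is also why sensitive values need checking before they are committed.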

A source loader:

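from sqlbuild.loaders import loader
from sqlbuild.executor.load.models import LoaderContext

# fetch_all_orders and fetch_orders_since stand in for your own extraction code.
@loader
def raw_orders(ctx: LoaderContext) -> list[dict[str, object]]:
    # No cursor value yet (for example, on the first run): load everything.
    if ctx.current_cursor_value is None:
        return fetch_all_orders()
    # Otherwise load only what arrived since the current cursor value.
    return fetch_orders_since(ctx.current_cursor_value)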

Python loaders should return rows for SQLBuild to write, such as a list of dictionaries or another supported tabular row object. Self-managed loaders that write their own data can return None.

Bound to a source in sources/*.yml:

sources:
  - name: raw_orders
    managed: true
    write_strategy: delete_insert
    cursor_column: ordered_at
    columns:
      - name: id
        type: INTEGER
      - name: ordered_at
        type: TIMESTAMP
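
For intuition about the `delete_insert` write strategy, the following sketch shows one common interpretation: delete the target rows that fall within the incoming batch's cursor range, then insert the batch, so that reloading overlapping data does not duplicate rows. This is an assumption about the general technique, not a description of how SQLBuild implements it; `delete_then_insert` is a hypothetical helper running against SQLite.

```python
import sqlite3
from typing import Dict, List


def delete_then_insert(conn: sqlite3.Connection, table: str,
                       cursor_column: str,
                       rows: List[Dict[str, object]]) -> int:
    """Replace the slice of `table` covered by the batch's cursor range."""
    if not rows:
        return 0
    low = min(row[cursor_column] for row in rows)
    high = max(row[cursor_column] for row in rows)
    columns = list(rows[0])
    placeholders = ", ".join("?" for _ in columns)
    with conn:  # commit both statements together, or roll back on error
        conn.execute(
            f"DELETE FROM {table} WHERE {cursor_column} BETWEEN ? AND ?",
            (low, high),
        )
        conn.executemany(
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})",
            [tuple(row[c] for c in columns) for row in rows],
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (id INTEGER, ordered_at TEXT)")

delete_then_insert(conn, "raw_orders", "ordered_at", [
    {"id": 1, "ordered_at": "2026-01-01T00:00:00"},
    {"id": 2, "ordered_at": "2026-01-02T00:00:00"},
])
# The second batch overlaps the first on 2026-01-02; row 2 is replaced, not duplicated.
delete_then_insert(conn, "raw_orders", "ordered_at", [
    {"id": 2, "ordered_at": "2026-01-02T00:00:00"},
    {"id": 3, "ordered_at": "2026-01-03T00:00:00"},
])

ids = [r[0] for r in conn.execute("SELECT id FROM raw_orders ORDER BY id")]
print(ids)
```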

Supported adapters

Adapter	Status
DuckDB	Supported
MotherDuck	Supported
Snowflake	Supported
BigQuery	Supported
Databricks	Supported
PostgreSQL	Supported
SQL Server	Supported

Documentation

Full documentation is available at docs.sqlbuild.com.

Contributing

We welcome contributions. Please see CONTRIBUTING.md for guidelines.

License

SQLBuild is licensed under the Apache License 2.0.

Project details

Release history

0.41.1

Jun 26, 2026

0.41.0

Jun 26, 2026

0.40.1 yanked

Jun 24, 2026

Reason this release was yanked:

Just a shitty release

0.40.0 yanked

Jun 24, 2026

Reason this release was yanked:

shitty release

0.39.3

Jun 24, 2026

0.39.2

Jun 24, 2026

0.39.1

Jun 24, 2026

0.39.0

Jun 24, 2026

0.38.6

Jun 24, 2026

0.38.5

Jun 24, 2026

0.38.4

Jun 24, 2026

0.38.3

Jun 24, 2026

0.38.2

Jun 23, 2026

0.38.1

Jun 23, 2026

0.38.0

Jun 23, 2026

0.37.7

Jun 23, 2026

0.37.6

Jun 23, 2026

0.37.5

Jun 23, 2026

0.37.4

Jun 22, 2026

0.37.3

Jun 22, 2026

0.37.2

Jun 22, 2026

0.37.1

Jun 21, 2026

0.36.0

Jun 21, 2026

0.35.0

Jun 17, 2026

0.34.0

Jun 13, 2026

This version

0.33.0

Jun 13, 2026

0.32.0

Jun 12, 2026

0.31.0

Jun 12, 2026

0.30.1

Jun 8, 2026

0.30.0

Jun 7, 2026

0.29.0

Jun 6, 2026

0.28.1

Jun 6, 2026

0.28.0

Jun 6, 2026

0.27.0

Jun 5, 2026

0.26.2

Jun 3, 2026

0.26.1

Jun 2, 2026

0.26.0

Jun 2, 2026

0.25.1

Jun 1, 2026

0.25.0

Jun 1, 2026

0.24.0

May 28, 2026

0.23.0

May 28, 2026

0.22.1

May 28, 2026

0.22.0

May 28, 2026

0.21.0

May 28, 2026

0.20.1

May 26, 2026

0.20.0

May 26, 2026

0.19.0

May 24, 2026

0.18.0

May 23, 2026

0.16.2

May 18, 2026

0.16.1

May 17, 2026

0.15.0

May 17, 2026

0.14.0

May 16, 2026

0.13.0

May 16, 2026

0.12.0

May 15, 2026

0.10.0

May 14, 2026

0.9.0

May 14, 2026

0.8.0

May 13, 2026

0.7.0

May 7, 2026

0.4.0

May 5, 2026

0.3.0

May 5, 2026

0.2.1

May 5, 2026

0.2.0

May 5, 2026

0.0.1

Apr 19, 2026

Download files

Source Distribution

sqlbuild-0.33.0.tar.gz (3.1 MB)

Uploaded Jun 13, 2026 Source

Built Distribution

sqlbuild-0.33.0-py3-none-any.whl (1.3 MB)

Uploaded Jun 13, 2026 Python 3

File details

Details for the file sqlbuild-0.33.0.tar.gz.

File metadata

Download URL: sqlbuild-0.33.0.tar.gz
Upload date: Jun 13, 2026
Size: 3.1 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for sqlbuild-0.33.0.tar.gz
Algorithm	Hash digest
SHA256	`f432a353bb418d9168508b1ff1a2c1ad34c3b439ab6c9916322617462ab312d7`
MD5	`b8a04d6b8def3fd5dbba313ad78851e2`
BLAKE2b-256	`ca9b5be6f5e49e05f7a2c102c4d6cb5671f0c845302ba0af1bb7a2c7b8494ccb`

Provenance

The following attestation bundles were made for sqlbuild-0.33.0.tar.gz:

Publisher: publish.yml on chio-labs/sqlbuild

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: sqlbuild-0.33.0.tar.gz
- Subject digest: f432a353bb418d9168508b1ff1a2c1ad34c3b439ab6c9916322617462ab312d7
- Sigstore transparency entry: 1807965465
- Sigstore integration time: Jun 13, 2026
Source repository:
- Permalink: chio-labs/sqlbuild@e2d729b88bb7777662464cfcaefc2ec1cb97d7e1
- Branch / Tag: refs/heads/main
- Owner: https://github.com/chio-labs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e2d729b88bb7777662464cfcaefc2ec1cb97d7e1
- Trigger Event: workflow_dispatch

File details

Details for the file sqlbuild-0.33.0-py3-none-any.whl.

File metadata

Download URL: sqlbuild-0.33.0-py3-none-any.whl
Upload date: Jun 13, 2026
Size: 1.3 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for sqlbuild-0.33.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`47acef9c038780d8a84e36c32ef911b4264b7f5567826808f3ac85e0e5d52fa7`
MD5	`aea657c12e45865c91f7d4b51d298604`
BLAKE2b-256	`8d81b9b87e0d7d34c87565457c1afe0925a9dd433fca5f6276fc830a273ee938`

Provenance

The following attestation bundles were made for sqlbuild-0.33.0-py3-none-any.whl:

Publisher: publish.yml on chio-labs/sqlbuild

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: sqlbuild-0.33.0-py3-none-any.whl
- Subject digest: 47acef9c038780d8a84e36c32ef911b4264b7f5567826808f3ac85e0e5d52fa7
- Sigstore transparency entry: 1807965466
- Sigstore integration time: Jun 13, 2026
Source repository:
- Permalink: chio-labs/sqlbuild@e2d729b88bb7777662464cfcaefc2ec1cb97d7e1
- Branch / Tag: refs/heads/main
- Owner: https://github.com/chio-labs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e2d729b88bb7777662464cfcaefc2ec1cb97d7e1
- Trigger Event: workflow_dispatch
