Run dbt Python models locally and materialize results back to Postgres

dbt-pybridge

dbt-pybridge runs dbt Python models on Postgres by executing Python locally or in CI, then writing results back to Postgres.

It works by:

  • compiling .py models through dbt
  • executing Python locally (developer laptop or CI runner)
  • loading dbt.ref() / dbt.source() data into pandas/polars
  • writing the returned dataframe back into Postgres
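
The read-transform-write loop can be sketched as follows, using SQLite in place of Postgres so the snippet is self-contained (the table names and the doubled-amount transform are illustrative, not the adapter's actual code):

```python
import sqlite3

import pandas as pd

# SQLite stands in for Postgres here so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
pd.DataFrame({"order_id": [1, 2], "amount": [10.0, 20.0]}).to_sql(
    "stg_orders", conn, index=False
)

df = pd.read_sql("SELECT * FROM stg_orders", conn)   # what dbt.ref("stg_orders") loads
df["double_amount"] = df["amount"] * 2               # the Python transform
df.to_sql("orders_doubled", conn, index=False)       # write results back

out = pd.read_sql("SELECT * FROM orders_doubled", conn)
```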

Status

The MVP covers Python table, incremental, and view materializations:

  • Supported: materialized='table'
  • Supported: materialized='incremental' (strategies: append, merge, delete+insert)
  • Supported: materialized='view' (implemented as a managed backing table + SQL view)
  • Supported DAG: sql -> python -> sql
  • Supported return types: pandas DataFrame, polars DataFrame, or iterable/generator of dataframes
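
A sketch of an incremental Python model using the merge strategy. The config keys follow standard dbt conventions; the `FakeDbt` stub below exists only to make the example runnable and is not part of the adapter:

```python
import pandas as pd

def model(dbt, session):
    # merge upserts rows on unique_key; append and delete+insert work similarly
    dbt.config(
        materialized="incremental",
        incremental_strategy="merge",
        unique_key="order_id",
    )
    df = dbt.ref("stg_orders")
    return df[df["amount"] > 0]

# Minimal stand-in for the dbt object, for illustration only.
class FakeDbt:
    def config(self, **kwargs):
        self.cfg = kwargs

    def ref(self, name):
        return pd.DataFrame({"order_id": [1, 2], "amount": [5.0, -1.0]})

dbt = FakeDbt()
result = model(dbt, session=None)
```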

Install

pip install -e .

Use a supported Python version (3.11/3.12 recommended).

Profile

Set your profile type to pybridge:

my_profile:
  target: dev
  outputs:
    dev:
      type: pybridge
      host: localhost
      user: postgres
      password: postgres
      port: 5432
      dbname: analytics
      schema: public
      threads: 1
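
These fields map onto a standard Postgres connection URL; a minimal sketch of how they compose (values mirror the profile above, and the URL format is the usual SQLAlchemy/psycopg convention, not adapter-specific):

```python
profile = {
    "host": "localhost",
    "user": "postgres",
    "password": "postgres",
    "port": 5432,
    "dbname": "analytics",
}

# Standard Postgres URL built from the profile fields (illustrative).
url = (
    f"postgresql://{profile['user']}:{profile['password']}"
    f"@{profile['host']}:{profile['port']}/{profile['dbname']}"
)
```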

Example model

def model(dbt, session):
    df = dbt.ref("stg_orders")
    df["double_amount"] = df["amount"] * 2
    return df

Safer projection pattern:

def model(dbt, session):
    df = dbt.ref("stg_orders")[["order_id", "amount", "customer_id"]]
    return df

How to create Python models

  1. Create models/<name>_python.py.
  2. Define exactly one callable entrypoint: def model(dbt, session): ...
  3. Set materialization inside the function:
    • dbt.config(materialized="table")
  4. Read upstream inputs using standalone ref/source assignments (important for dbt parser):
    • orders = dbt.ref("stg_orders")
    • raw_orders = dbt.source("raw", "orders")
  5. Return one of:
    • pandas DataFrame
    • polars DataFrame
    • iterable/generator that yields pandas/polars DataFrames

Parser-safe pattern:

def model(dbt, session):
    dbt.config(materialized="table")
    orders = dbt.ref("stg_orders")
    result = orders.copy()
    result["double_amount"] = result["amount"] * 2
    return result

Chunked mode:

def model(dbt, session):
    for batch in dbt.ref("stg_orders").iter_batches(batch_size=100_000):
        yield transform(batch)
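
transform above is user code; a minimal sketch, with iter_batches emulated by a plain generator so the example runs without a database (the real iter_batches streams from Postgres):

```python
import pandas as pd

def transform(batch: pd.DataFrame) -> pd.DataFrame:
    out = batch.copy()
    out["double_amount"] = out["amount"] * 2
    return out

# Plain-generator stand-in for iter_batches, for illustration only.
def iter_batches(df: pd.DataFrame, batch_size: int):
    for start in range(0, len(df), batch_size):
        yield df.iloc[start:start + batch_size]

orders = pd.DataFrame({"amount": range(5)})
batches = [transform(b) for b in iter_batches(orders, batch_size=2)]
```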

Runtime logging includes progress messages such as:

  • [pybridge] Loading "transform"."stg_orders" (2,300,000 rows, 120.0 MB)
  • [pybridge] Processing batch 1, rows=100000
  • [pybridge] Writing batch 1, rows=100000

Runtime configs

Set model-level configs via dbt.config(...) in your Python model:

  • pybridge_dataframe_backend: pandas (default) or polars
  • pybridge_max_rows: hard limit before failure (default 1_000_000)
  • pybridge_warn_rows: warning threshold (default 200_000)
  • pybridge_max_bytes: hard estimated table-size limit before failure (default 536870912, 512MB)
  • pybridge_warn_bytes: warning estimated table-size threshold (default 134217728, 128MB)
  • pybridge_allow_large_tables: bypass hard row limit (default false)
  • pybridge_chunked_mode: allow oversized input only when using iter_batches (default false)
  • pybridge_batch_size: default batch size for iter_batches (default 100_000)
  • pybridge_column_types: optional explicit type map for created target tables, for example:
    • {"id": "numeric(18,0)", "created_at": "timestamp", "payload": "jsonb"}
  • pybridge_categorical_types: optional categorical-column enum type map, for example:
    • {"status": "status_enum", "tier": "tier_enum"}

Legacy localpy_* keys are still accepted for backward compatibility.

Type inference details

Default inferred target types now include:

  • Numeric widths:
    • smallint / integer / bigint / numeric (for wide unsigned integers)
    • real / double precision
  • Temporal:
    • date, time, timetz, timestamp, timestamptz, interval
  • Structured / special:
    • uuid, bytea, jsonb
  • Arrays (homogeneous scalar list/tuple object columns):
    • boolean[], bigint[], double precision[], text[], uuid[], date[], time[], timetz[], timestamp[], timestamptz[], numeric[]
    • mixed or nested list structures fall back to jsonb

Notes:

  • Decimal object columns infer numeric(precision,scale) from sampled values.
  • Empty or ambiguous object columns fall back to text (or jsonb for ambiguous list structures).
  • You can always override with pybridge_column_types.
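
One plausible way precision and scale could be derived from sampled Decimal values (illustrative; not necessarily the adapter's exact sampling logic):

```python
from decimal import Decimal

def infer_numeric_type(samples):
    """Derive a numeric(precision,scale) type from Decimal samples.

    Hypothetical helper: scale is the widest fractional part seen,
    precision is that scale plus the widest integer part seen.
    """
    scale = 0
    int_digits = 1
    for value in samples:
        exponent = value.as_tuple().exponent
        frac = -exponent if exponent < 0 else 0
        scale = max(scale, frac)
        digits = len(value.as_tuple().digits) - frac
        int_digits = max(int_digits, digits, 1)
    return f"numeric({int_digits + scale},{scale})"

sql_type = infer_numeric_type([Decimal("12.345"), Decimal("1234.5")])
```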

Honest limitations

  • Not Snowpark
  • Not Spark
  • Python runs on local machine / CI runner
  • Not intended for huge tables
  • Best for small/medium transforms
  • Not a replacement for warehouse-scale computation
  • For large tables, use filtering, incremental models, or chunked execution

First milestone command

dbt run -s customer_features

More examples

The examples/mvp_project/ directory has runnable models for each major feature:

  • customer_features.py — minimal pandas table model
  • orders_polars.py — polars backend (pybridge_dataframe_backend='polars')
  • daily_revenue_incremental.py — incremental + merge strategy with unique_key
  • orders_with_jsonb.py — pybridge_column_types overrides for jsonb, text[], and numeric(18,4)

cd examples/mvp_project
dbt run -s orders_polars
dbt run -s daily_revenue_incremental
dbt run -s daily_revenue_incremental                # second run exercises merge
dbt run -s daily_revenue_incremental --full-refresh # rebuild from scratch
dbt run -s orders_with_jsonb

Project details

Published to PyPI as dbt_pybridge 0.1.2 via Trusted Publishing (release.yml on kraftaa/dbt-pybridge).