
dbt-bridge

A dbt-native Reverse ETL and Cross-Database Movement tool powered by dlt.

dbt-bridge allows you to move data between databases, APIs, and warehouses directly within your dbt Python models. It leverages dlt (Data Load Tool) for robust, schema-aware data loading.

Features

  • Cross-Database Movement: Move data from Postgres to Snowflake, S3 to BigQuery, etc.
  • Reverse ETL: Push your modeled dbt data to external destinations (Salesforce, HubSpot, Postgres, etc.).
  • API Ingestion: Fetch data from APIs, transform it with Pandas in-memory, and load it to your warehouse.
  • The "Bridge Pattern": Extract -> Transform (Local) -> Load (Remote).
  • Lineage: Registers "Ghost Sources" in dbt so your lineage graph remains complete.

Installation

Install dbt-bridge in your dbt environment. You must include the extras for your specific source/destination.

# Example: Snowflake destination, Postgres source
pip install "dbt-bridge[snowflake,postgres]"

# Example: All supported extras
pip install "dbt-bridge[all]"

Supported Extras

  • Warehouses: snowflake, bigquery, redshift, databricks, synapse, fabric
  • Databases: postgres, mssql, duckdb, trino, athena
  • Storage: s3, gcs, azure, filesystem
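
For the Bridge Pattern described below, you will typically combine a local engine extra with your source and destination extras:

# Example: local DuckDB staging, Postgres source, Snowflake destination
pip install "dbt-bridge[duckdb,postgres,snowflake]"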

Usage Patterns

1. Database-to-Database Transfer

Move a table from a source database (e.g., Postgres) to your destination (e.g., Snowflake).

import dbt_bridge
import dlt
from dlt.sources.sql_database import sql_database

def model(dbt, session):
    dbt.config(materialized='table', packages=['dbt-bridge', 'dlt', 'psycopg2-binary'])

    # 1. Define Source (e.g., Postgres)
    # Credentials loaded from secrets.toml
    source = sql_database(schema="public", table_names=["users"])
    
    # 2. Explicit Lineage (Important!)
    dbt.source("postgres_prod", "users")

    # 3. Define Destination (e.g., Snowflake)
    destination = dlt.destinations.snowflake()

    # 4. Transfer
    return dbt_bridge.transfer(
        dbt=dbt,
        source_data=source,
        target_destination=destination,
        dataset_name="raw_postgres",
        table_name="users_synced"
    )
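
Run this like any other dbt model (e.g. dbt run --select <model_name>). At runtime, dlt resolves the Postgres and Snowflake credentials from .dlt/secrets.toml or environment variables (see Configuration below).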

2. API-to-Database (with Transformation)

Fetch data from an API, transform it with Pandas, and load it.

import dbt_bridge
import dlt
import pandas as pd
from dlt.sources.helpers.rest_client import RESTClient

def model(dbt, session):
    dbt.config(materialized='table', packages=['dbt-bridge', 'dlt', 'pandas'])

    # 1. Fetch Data
    client = RESTClient(base_url="https://api.example.com")
    data_generator = client.paginate("/users")
    
    # 2. Convert to DataFrame & Transform
    # Use the helper to flatten/convert dlt generators to Pandas
    df = dbt_bridge.api_to_df(data_generator)
    df['email'] = df['email'].str.lower()

    # 3. Load
    destination = dlt.destinations.snowflake()
    
    return dbt_bridge.transfer(
        dbt=dbt,
        source_data=df,
        target_destination=destination,
        dataset_name="raw_api",
        table_name="users"
    )
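
Most real APIs require authentication. dlt's REST client accepts an auth helper; here is a minimal sketch using a bearer token, where the secrets path sources.example_api.token is a placeholder for your own key:

import dlt
from dlt.sources.helpers.rest_client import RESTClient
from dlt.sources.helpers.rest_client.auth import BearerTokenAuth

# "sources.example_api.token" is a hypothetical secrets.toml path
client = RESTClient(
    base_url="https://api.example.com",
    auth=BearerTokenAuth(token=dlt.secrets["sources.example_api.token"]),
)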

3. The "Bridge Pattern" (Cross-Database Transformation)

Extract from Source A -> Model Locally (DuckDB) -> Push to Destination B.

  1. Ingest Model: A Python model fetches data from Source A and returns a DataFrame; dbt saves it as a local table (see the ingest sketch after this list).
  2. Transform Model: A SQL model reads the local table and applies business logic.
  3. Push Model: A Python model reads the final SQL model and pushes it to Destination B.
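
Ingest Model Example (a minimal sketch; api_to_df is shown above for API generators, and reusing it to flatten a SQL resource is an assumption):

import dbt_bridge
from dlt.sources.sql_database import sql_database

def model(dbt, session):
    dbt.config(materialized='table', packages=['dbt-bridge', 'dlt', 'pandas'])

    # Fetch the source table from Source A (credentials from secrets.toml)
    source = sql_database(schema="public", table_names=["users"])

    # Flatten the dlt resource into a DataFrame; dbt materializes the
    # returned DataFrame as a local table for the SQL model in step 2
    return dbt_bridge.api_to_df(source.users)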

Push Model Example:

import dbt_bridge
import dlt

def model(dbt, session):
    dbt.config(materialized='table')

    # 1. Read Final Model
    # Convert to Arrow/Pandas for dlt compatibility
    final_df = dbt.ref("final_users").arrow() 

    # 2. Push to Destination
    destination = dlt.destinations.snowflake()
    
    return dbt_bridge.transfer(
        dbt=dbt,
        source_data=final_df,
        target_destination=destination,
        dataset_name="analytics_prod",
        table_name="final_report"
    )
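
Note: the .arrow() call assumes a local adapter such as dbt-duckdb, where dbt.ref() returns a relation exposing .arrow(). On adapters whose dbt.ref() already returns a dataframe, convert it to Pandas first (e.g., .to_pandas() on Snowpark) before handing it to transfer.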

Configuration

dlt uses a .dlt/secrets.toml file (or environment variables) for credentials. Place this in your dbt project root.

[destination.snowflake.credentials]
database = "ANALYTICS"
password = "password"
username = "user"
host = "account_id" # Do not include .snowflakecomputing.com
warehouse = "COMPUTE_WH"

[sources.sql_database.credentials]
drivername = "postgresql"
database = "db_name"
password = "password"
username = "user"
host = "host"
port = 5432
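
Alternatively, dlt reads the same values from environment variables, using uppercase names with double-underscore nesting:

# Equivalent environment variables
export DESTINATION__SNOWFLAKE__CREDENTIALS__PASSWORD="password"
export SOURCES__SQL_DATABASE__CREDENTIALS__PASSWORD="password"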

License

MIT

