dbt-feldera

The dbt adapter for Feldera.

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Feldera is a streaming SQL engine powered by the DBSP incremental computation engine. It automatically incrementalizes every SQL query without watermarks, scans, or MERGE. When input data changes, only affected output rows are recomputed.

[!IMPORTANT] This adapter deploys continuous pipelines, not ad-hoc queries.

Feldera supports two modes of query execution:

  • Continuous pipelines compile SQL into an incremental dataflow that runs indefinitely, processing every input change as it arrives in near real-time.
  • Ad-hoc queries are one-shot batch queries executed by DataFusion against the state of materialized tables and views. They exist primarily for development and debugging.

When you run dbt run, this adapter assembles your models into a Feldera pipeline program, compiles it, and starts a continuously running pipeline. The pipeline keeps processing input changes and updating outputs until it is explicitly stopped. This differs from typical batch-oriented dbt adapters where dbt run executes a query once, processes a batch of data and exits.

Key features

  • Automatic incremental view maintenance (IVM) — Feldera's DBSP engine incrementalizes any SQL query out of the box. No manual merge logic or watermark tuning required.
  • Continuous pipeline deployment — dbt run compiles and starts a long-running Feldera pipeline; it does not execute one-shot queries.
  • Connector integration — attach Kafka, Delta Lake, S3, and HTTP connectors directly to models via configuration.
  • Easy setup — pure Python adapter with no ODBC driver needed.

Installation

pip install dbt-feldera

or with uv:

uv add dbt-feldera

Requires Python 3.10+ and dbt-core ~=1.9.

Configuration

Add a Feldera target to your profiles.yml:

my_project:
  target: dev
  outputs:
    dev:
      type: feldera
      host: "http://localhost:8080"
      api_key: "apikey:..."          # optional — for authenticated instances
      database: "default"
      schema: "my_pipeline"          # maps to the Feldera pipeline name
      compilation_profile: dev       # dev | unoptimized | optimized
      workers: 4
      timeout: 300

Concept mapping

Feldera uses different terminology than traditional databases. Here's how dbt concepts map to Feldera. Every materialization contributes SQL to a continuously running pipeline — nothing is executed as a one-shot batch query.

| dbt concept | Feldera concept | Description |
| --- | --- | --- |
| database | (unused) | Set to any string (e.g. `"default"`) |
| schema | Pipeline name | Each dbt schema maps to one Feldera pipeline (a continuously running SQL program) |
| table materialization | Input table | External data source (Kafka, HTTP, S3) |
| view materialization | View | SQL view inside the continuous pipeline (all views are incrementally maintained) |
| view + stored: true | Materialized view | Queryable via ad-hoc queries |
| seed | Table + HTTP push | Schema registered, data pushed via HTTP ingress |

Configuration options

| Option | Default | Description |
| --- | --- | --- |
| host | `http://localhost:8080` | Feldera API base URL |
| api_key | (none) | API key for authenticated Feldera instances |
| schema | (required) | Pipeline name in Feldera |
| compilation_profile | `dev` | SQL compilation profile: `dev` (fast compile), `unoptimized`, or `optimized` (best runtime performance) |
| workers | 4 | Number of pipeline worker threads |
| timeout | 300 | Max wait (seconds) for pipeline compilation + startup |

Materializations

view — Intermediate transform / Materialized view

Creates a CREATE VIEW in the pipeline. Use for intermediate transformations that don't need to be queried directly or connected to an output.

-- models/orders_enriched.sql
{{ config(materialized='view') }}

SELECT o.id, o.total, c.name AS customer_name
FROM {{ ref('orders') }} o
JOIN {{ ref('customers') }} c ON o.customer_id = c.id

Set stored: true to promote to a CREATE MATERIALIZED VIEW — a view backed by persistent storage, enabling ad-hoc queries:

-- models/sales_summary.sql
{{ config(
    materialized='view',
    stored=true,
    connectors=[{'transport': {'name': 'my_delta_connector'}}]
) }}

SELECT
    region,
    product_category,
    SUM(amount) AS total_sales,
    COUNT(*) AS order_count
FROM {{ ref('orders') }}
GROUP BY region, product_category

[!NOTE] Every view in Feldera is automatically incrementally maintained by the DBSP engine. When inputs change, only affected output rows are recomputed — no watermarks, merge logic, or special configuration required. The stored flag controls only whether the view's state is queryable (via ad-hoc queries); it does not change how the view is computed.
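Since a stored view is queryable ad hoc, you can spot-check its contents against a running pipeline. A hedged Python sketch of building such a request URL — the /v0/pipelines/{name}/query endpoint and its sql parameter are assumptions about Feldera's REST API, not documented behavior of this adapter:

```python
from urllib.parse import quote

def adhoc_query_url(host: str, pipeline: str, sql: str) -> str:
    """URL for a one-shot ad-hoc (DataFusion) query against a pipeline.

    The path and `sql` query parameter are assumptions; check your
    Feldera instance's API reference.
    """
    return f"{host}/v0/pipelines/{pipeline}/query?sql={quote(sql)}"

url = adhoc_query_url("http://localhost:8080", "my_pipeline",
                      "SELECT * FROM sales_summary LIMIT 10")
```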

On --full-refresh, the pipeline is stopped, all stored state (including connector offsets) is cleared, and the pipeline is redeployed from scratch.

table — Input source

Creates a CREATE TABLE — an input source for external data ingress. The model SQL defines the column schema, not a SELECT query. Attach connectors for Kafka, S3, HTTP, or other input sources.

-- models/raw_events.sql
{{ config(
    materialized='table',
    connectors=[{
        'transport': {
            'name': 'kafka_in',
            'config': {
                'bootstrap.servers': 'redpanda:29092',
                'topics': ['events']
            }
        },
        'format': {'name': 'json'}
    }]
) }}

event_id BIGINT NOT NULL,
event_type VARCHAR NOT NULL,
payload VARCHAR,
created_at TIMESTAMP NOT NULL
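Besides connectors, an input table can also be fed over HTTP. A rough Python sketch of what an ingress request for the table above could look like — the endpoint path and the {"insert": ...} record envelope are assumptions about Feldera's HTTP ingress and JSON update formats:

```python
import json

def ingress_url(host: str, pipeline: str, table: str) -> str:
    # Assumed ingress endpoint layout; verify against your instance.
    return f"{host}/v0/pipelines/{pipeline}/ingress/{table}?format=json"

def insert_record(row: dict) -> str:
    # One JSON object per line (NDJSON); each change is assumed to be
    # wrapped as {"insert": ...} or {"delete": ...}.
    return json.dumps({"insert": row})

body = insert_record({"event_id": 1, "event_type": "click",
                      "payload": None, "created_at": "2024-01-01 00:00:00"})
```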

incremental — Unsupported

[!IMPORTANT] dbt-feldera does not support the incremental materialization because all views in Feldera are natively maintained incrementally by the DBSP engine.

Use materialized='view' with stored=true instead:

{{ config(materialized='view', stored=true) }}

streaming_pipeline — Full pipeline as a single model

Deploys an entire Feldera pipeline as one dbt model. The model SQL is the complete pipeline program — containing CREATE TABLE and CREATE VIEW statements. Useful for complex multi-table, multi-view pipelines managed as a single unit.

-- models/my_pipeline.sql
{{ config(materialized='streaming_pipeline') }}

CREATE TABLE orders (
    id BIGINT NOT NULL,
    customer_id BIGINT NOT NULL,
    amount DECIMAL(10, 2) NOT NULL
);

CREATE TABLE customers (
    id BIGINT NOT NULL,
    name VARCHAR NOT NULL
);

CREATE MATERIALIZED VIEW enriched_orders AS
SELECT o.id, o.amount, c.name AS customer_name
FROM orders o
JOIN customers c ON o.customer_id = c.id;

seed — Reference data via HTTP push

Seeds register a CREATE TABLE and push row data via Feldera's HTTP ingress API after the pipeline is deployed. Use for small reference datasets (CSVs).

dbt seed                # push seed data
dbt seed --full-refresh # stop, clear storage, redeploy, then push
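The push step can be pictured as turning the seed CSV into newline-delimited JSON, one change record per row. A sketch under the assumption that Feldera's JSON update format wraps each inserted row as {"insert": ...}:

```python
import csv
import io
import json

def seed_to_ndjson(csv_text: str) -> str:
    """Convert a dbt seed CSV into one {"insert": row} object per line."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return "\n".join(json.dumps({"insert": row}) for row in rows)

body = seed_to_ndjson("country,code\nFrance,FR\nJapan,JP\n")
```

Note that DictReader yields every field as a string; a real push would need column-type coercion to match the registered table schema.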

Summary

| Materialization | Feldera SQL | Best for |
| --- | --- | --- |
| view | CREATE VIEW | Incrementally maintained intermediate transforms |
| view + stored: true | CREATE MATERIALIZED VIEW | Queryable outputs |
| table | CREATE TABLE | External input sources (Kafka, S3, HTTP) |
| streaming_pipeline | Full program | Multi-table/view pipelines as a single unit |
| seed | CREATE TABLE + data push | Small reference datasets (HTTP ingress; any connector can also be attached) |

Contributing

See CONTRIBUTING.md for development setup, testing, and project layout.

License

Apache-2.0 — see LICENSE for details.
