dbt-risingwave

A RisingWave adapter plugin for dbt.

RisingWave is a cloud-native streaming database that uses SQL as the interface language. It is designed to reduce the complexity and cost of building real-time applications. See https://www.risingwave.com.

dbt enables data analysts and engineers to transform data using software engineering workflows. For the broader RisingWave integration guide, see https://docs.risingwave.com/integrations/other/dbt.

Getting Started

  1. Install dbt-risingwave.

python3 -m pip install dbt-risingwave

  2. Get RisingWave running by following the official guide: https://www.risingwave.dev/docs/current/get-started/.

  3. Configure ~/.dbt/profiles.yml.

default:
  outputs:
    dev:
      type: risingwave
      host: 127.0.0.1
      user: root
      pass: ""
      dbname: dev
      port: 4566
      schema: public
  target: dev

  4. Run dbt debug to verify the connection.

Common Features

Detailed reference: docs/.

Schema Authorization

Use schema_authorization when dbt should create schemas with a specific owner:

{{ config(materialized='table', schema_authorization='my_role') }}

See docs/configuration.md for model-level and dbt_project.yml examples.

Streaming Parallelism

The adapter supports RisingWave session settings such as streaming_parallelism, streaming_parallelism_for_backfill, and streaming_max_parallelism in both profiles and model configs.

See docs/configuration.md for the full configuration matrix.
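
As a rough illustration, a model could set these session settings in its config block. The setting names are the ones listed above. Passing them as config() keyword arguments mirrors the schema_authorization example and is an assumption here, and the values are arbitrary placeholders rather than tuned recommendations.

```sql
{# Sketch only: the numbers below are illustrative placeholders. #}
{{ config(
    materialized='materialized_view',
    streaming_parallelism=4,
    streaming_parallelism_for_backfill=2,
    streaming_max_parallelism=8
) }}

select *
from {{ ref('events') }}
```

The events model is a stand-in name. Check docs/configuration.md for which settings apply to which materializations before relying on this shape.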

Serverless Backfill

Use enable_serverless_backfill=true in a model config or profile to enable serverless backfills for streaming queries.

See docs/configuration.md for examples.
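
A minimal sketch of the model-level form follows. It assumes the flag is passed as a config() keyword argument on a materialized_view model. The events model is a hypothetical upstream model used only for illustration.

```sql
{# Sketch only: assumes materialized_view is a typical target for this flag. #}
{{ config(
    materialized='materialized_view',
    enable_serverless_backfill=true
) }}

select *
from {{ ref('events') }}
```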

Background DDL

background_ddl=true lets supported materializations submit background DDL while still preserving dbt semantics by issuing RisingWave WAIT before dbt continues.

See docs/configuration.md for supported materializations, examples, and the cluster-wide WAIT caveat.
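
The model-level setting might look like the sketch below. It assumes materialized_view is among the supported materializations and that the option is passed as a config() keyword argument. The events model is a hypothetical name.

```sql
{# Sketch only: confirm the supported materializations in docs/configuration.md. #}
{{ config(
    materialized='materialized_view',
    background_ddl=true
) }}

select *
from {{ ref('events') }}
```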

Zero-Downtime Rebuilds

materialized_view and view support swap-based zero-downtime rebuilds through zero_downtime={'enabled': true} plus the runtime flag --vars 'zero_downtime: true'.

See docs/zero-downtime-rebuilds.md for requirements, cleanup behavior, and helper commands.
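
Put together, the two pieces described above could look like this sketch. The config key and the --vars flag are taken from the text. The events model is a hypothetical name, and the exact requirements are in docs/zero-downtime-rebuilds.md.

```sql
{# Model-side opt-in, using the config key described above. #}
{{ config(
    materialized='materialized_view',
    zero_downtime={'enabled': true}
) }}

{# As described above, the run must also pass the runtime flag:
   dbt run --vars 'zero_downtime: true' #}
select *
from {{ ref('events') }}
```

Requiring both a model-level setting and a run-level flag means a plain dbt run does not trigger a swap unless the flag is passed explicitly.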

Functions

dbt-risingwave now supports a first version of dbt function resources for RisingWave scalar UDFs.

Current contract:

  • supported:
    • SQL scalar functions
    • JavaScript scalar functions via functions/*.sql plus config.language: javascript
    • external Python scalar functions via functions/*.sql plus config.language: python
      • with config.link: http://host:port
      • optional config.remote_name
      • optional config.always_retry_on_network_error
  • materialization: CREATE FUNCTION IF NOT EXISTS
  • JavaScript async options:
    • config.async: true -> WITH (async = true)
    • config.batch: true -> WITH (batch = true)
    • config.always_retry_on_network_error: true -> WITH (always_retry_on_network_error = true)
  • supported volatility config:
    • deterministic -> IMMUTABLE
    • stable -> STABLE
    • non-deterministic -> VOLATILE

Current limits:

  • no replace/update path for an existing function body
  • no overload-family management
  • no aggregate or table functions
  • no default arguments
  • upstream dbt-core function contracts do not yet map cleanly to RisingWave-native .js authoring or RisingWave external Python UDF authoring, so JavaScript and Python currently use adapter config on functions/*.sql

See docs/functions.md for the full first-version contract and example layout.

Indexes

RisingWave indexes support INCLUDE and DISTRIBUTED BY clauses beyond what the Postgres adapter exposes. Configure them in the model config:

{{ config(
    materialized='materialized_view',
    indexes=[
        {'columns': ['user_id'], 'include': ['name', 'email'], 'distributed_by': ['user_id']}
    ]
) }}

This generates:

CREATE INDEX IF NOT EXISTS "__dbt_index_mv_user_id"
  ON mv (user_id)
  INCLUDE (name, email)
  DISTRIBUTED BY (user_id);

| Option | Description |
| --- | --- |
| columns | Key columns for the index (required). |
| include | Additional columns stored in the index but not part of the key (optional). |
| distributed_by | Columns used to distribute the index across nodes (optional). |

Note: RisingWave does not support unique or type (index method) options from the Postgres adapter. These options are silently ignored.

Materializations

The adapter follows standard dbt model workflows, with RisingWave-specific materializations and behaviors.

Typical usage:

{{ config(materialized='materialized_view') }}

select *
from {{ ref('events') }}

| Materialization | Notes |
| --- | --- |
| materialized_view | Creates a materialized view. This is the main streaming materialization for RisingWave. |
| materializedview | Deprecated. Kept only for backward compatibility. Use materialized_view instead. |
| ephemeral | Uses common table expressions under the hood. |
| table | Creates a table from the model query. |
| view | Creates a view from the model query. |
| incremental | Batch-style incremental updates for tables. Prefer materialized_view when a streaming MV fits the workload. |
| connection | Runs a full CREATE CONNECTION statement supplied by the model SQL. |
| source | Runs a full CREATE SOURCE statement supplied by the model SQL. |
| table_with_connector | Runs a full CREATE TABLE ... WITH (...) statement supplied by the model SQL. |
| sink | Creates a sink, either from adapter configs or from a full SQL statement. |

See docs/configuration.md for adapter-specific configuration examples, including streaming session settings and background DDL.
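
For the materializations that take a full statement, the model file holds the DDL itself rather than a select. The sketch below shows what a source model might look like. Using {{ this }} for the relation name, the v1 column, and the datagen connector options are all assumptions chosen for illustration, so check them against the adapter and RisingWave documentation.

```sql
{# Sketch only: the connector, its options, and the column are illustrative assumptions. #}
{{ config(materialized='source') }}

CREATE SOURCE {{ this }} (v1 int)
WITH (
    connector = 'datagen',
    fields.v1.kind = 'sequence',
    fields.v1.start = '1',
    fields.v1.end = '100',
    datagen.rows.per.second = '10'
) FORMAT PLAIN ENCODE JSON
```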

Documentation

dbt Run Behavior

  • dbt run: creates models that do not already exist.
  • dbt run --full-refresh: drops and recreates models so the deployed objects match the current dbt definitions.

Graph Operators

Graph operators are useful when you want to rebuild only part of a project.

dbt run --select "my_model+"   # select my_model and all children
dbt run --select "+my_model"   # select my_model and all parents
dbt run --select "+my_model+"  # select my_model, and all of its parents and children

Data Tests

dbt-risingwave extends dbt data-test failure storage to support materialized_view in addition to the upstream table and view options.

Example:

models:
  - name: my_model
    columns:
      - name: id
        tests:
          - not_null:
              config:
                store_failures: true
                store_failures_as: materialized_view

This is useful for realtime monitoring workflows where test failures should remain continuously queryable as a RisingWave materialized view.

Examples
