dbt-risingwave

A RisingWave adapter plugin for dbt.

RisingWave is a cloud-native streaming database that uses SQL as the interface language. It is designed to reduce the complexity and cost of building real-time applications. See https://www.risingwave.com.

dbt enables data analysts and engineers to transform data using software engineering workflows. For the broader RisingWave integration guide, see https://docs.risingwave.com/integrations/other/dbt.

Getting Started

  1. Install dbt-risingwave.

python3 -m pip install dbt-risingwave

  2. Get RisingWave running by following the official guide: https://www.risingwave.dev/docs/current/get-started/.

  3. Configure ~/.dbt/profiles.yml.

default:
  outputs:
    dev:
      type: risingwave
      host: 127.0.0.1
      user: root
      pass: ""
      dbname: dev
      port: 4566
      schema: public
  target: dev

  4. Run dbt debug to verify the connection.

Common Features

Detailed reference: docs/.

Schema Authorization

Use schema_authorization when dbt should create schemas with a specific owner:

{{ config(materialized='table', schema_authorization='my_role') }}

See docs/configuration.md for model-level and dbt_project.yml examples.

Streaming Parallelism

The adapter supports RisingWave session settings such as streaming_parallelism, streaming_parallelism_for_backfill, and streaming_max_parallelism in both profiles and model configs.

See docs/configuration.md for the full configuration matrix.
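
As a minimal sketch of the model-level form, the three settings named above can be passed through the model config. The model name, the upstream model `events`, the `user_id` column, and the numeric values are placeholders chosen for illustration, not recommendations; docs/configuration.md remains the reference for which settings apply where.

```sql
-- models/events_by_user.sql (hypothetical model name)
-- The values 4, 2, and 8 are illustrative assumptions only.
{{ config(
    materialized='materialized_view',
    streaming_parallelism=4,
    streaming_parallelism_for_backfill=2,
    streaming_max_parallelism=8
) }}

select user_id, count(*) as event_count
from {{ ref('events') }}
group by user_id
```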

Serverless Backfill

Use enable_serverless_backfill=true in a model config or profile to enable serverless backfills for streaming queries.

See docs/configuration.md for examples.
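
A model-level sketch of the setting described above might look like the following. The upstream model `events` is a placeholder, and pairing the option with `materialized_view` is an assumption made here for illustration.

```sql
-- Assumed pairing: a streaming materialized view that opts in to serverless backfill.
{{ config(
    materialized='materialized_view',
    enable_serverless_backfill=true
) }}

select *
from {{ ref('events') }}
```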

Background DDL

background_ddl=true lets supported materializations submit background DDL while still preserving dbt semantics by issuing RisingWave WAIT before dbt continues.

See docs/configuration.md for supported materializations, examples, and the cluster-wide WAIT caveat.
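
The sketch below shows the model config and, in comments, roughly what the behavior corresponds to in a plain RisingWave session. It assumes `materialized_view` is among the supported materializations and uses a placeholder upstream model `events`; the commented statements illustrate the idea and are not the adapter's literal output.

```sql
-- Model config: opt this model in to background DDL.
{{ config(materialized='materialized_view', background_ddl=true) }}

select *
from {{ ref('events') }}

-- Roughly equivalent session-level sequence in RisingWave (illustration only):
--   SET BACKGROUND_DDL = true;
--   CREATE MATERIALIZED VIEW ... AS select ...;
--   WAIT;   -- blocks until background DDL jobs have finished
```

Because WAIT applies to background jobs in general rather than to a single statement, the cluster-wide caveat in docs/configuration.md is worth reading before enabling this on a busy cluster.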

Zero-Downtime Rebuilds

materialized_view and view support swap-based zero-downtime rebuilds through zero_downtime={'enabled': true} plus the runtime flag --vars 'zero_downtime: true'.

See docs/zero-downtime-rebuilds.md for requirements, cleanup behavior, and helper commands.
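
A minimal sketch of the model side of this setup, using the config key given above. The upstream model `events` is a placeholder.

```sql
-- Opt this materialized view in to swap-based rebuilds.
{{ config(
    materialized='materialized_view',
    zero_downtime={'enabled': true}
) }}

select *
from {{ ref('events') }}
```

The model config alone is not sufficient: per the description above, the runtime flag --vars 'zero_downtime: true' must also be passed on the dbt command line.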

Functions

dbt-risingwave now supports a first version of dbt function resources for RisingWave scalar UDFs.

Current contract:

  • supported:
    • SQL scalar functions
    • JavaScript scalar functions via functions/*.sql plus config.language: javascript
    • external Python scalar functions via functions/*.sql plus config.language: python
      • with config.link: http://host:port
      • optional config.remote_name
      • optional config.always_retry_on_network_error
  • materialization: CREATE FUNCTION IF NOT EXISTS
  • JavaScript async options:
    • config.async: true -> WITH (async = true)
    • config.batch: true -> WITH (batch = true)
    • config.always_retry_on_network_error: true -> WITH (always_retry_on_network_error = true)
  • supported volatility config:
    • deterministic -> IMMUTABLE
    • stable -> STABLE
    • non-deterministic -> VOLATILE

Current limits:

  • no replace/update path for an existing function body
  • no overload-family management
  • no aggregate or table functions
  • no default arguments
  • upstream dbt-core function contracts do not yet map cleanly to RisingWave-native .js authoring or RisingWave external Python UDF authoring, so JavaScript and Python currently use adapter config on functions/*.sql

See docs/functions.md for the full first-version contract and example layout.

Indexes

RisingWave indexes support INCLUDE and DISTRIBUTED BY clauses beyond what the Postgres adapter exposes. Configure them in the model config:

{{ config(
    materialized='materialized_view',
    indexes=[
        {'columns': ['user_id'], 'include': ['name', 'email'], 'distributed_by': ['user_id']}
    ]
) }}

This generates:

CREATE INDEX IF NOT EXISTS "__dbt_index_mv_user_id"
  ON mv (user_id)
  INCLUDE (name, email)
  DISTRIBUTED BY (user_id);

| Option | Description |
| --- | --- |
| columns | Key columns for the index (required). |
| include | Additional columns stored in the index but not part of the key (optional). |
| distributed_by | Columns used to distribute the index across nodes (optional). |

Note: RisingWave does not support unique or type (index method) options from the Postgres adapter. These options are silently ignored.

Materializations

The adapter follows standard dbt model workflows, with RisingWave-specific materializations and behaviors.

Typical usage:

{{ config(materialized='materialized_view') }}

select *
from {{ ref('events') }}

| Materialization | Notes |
| --- | --- |
| materialized_view | Creates a materialized view. This is the main streaming materialization for RisingWave. |
| materializedview | Deprecated. Kept only for backward compatibility. Use materialized_view instead. |
| ephemeral | Uses common table expressions under the hood. |
| table | Creates a table from the model query. |
| view | Creates a view from the model query. |
| incremental | Batch-style incremental updates for tables. Prefer materialized_view when a streaming MV fits the workload. |
| connection | Runs a full CREATE CONNECTION statement supplied by the model SQL. |
| source | Runs a full CREATE SOURCE statement supplied by the model SQL. |
| table_with_connector | Runs a full CREATE TABLE ... WITH (...) statement supplied by the model SQL. Supports explicit additive ALTER TABLE ADD COLUMN changes through on_schema_change='append_new_columns'. |
| sink | Creates a sink, either from adapter configs or from a full SQL statement. |

See docs/configuration.md for adapter-specific configuration examples, including streaming session settings and background DDL.
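
For the statement-style materializations, the model file carries the whole DDL statement rather than a select. The sketch below illustrates this for `source`, assuming the built-in `datagen` connector so that no external system is needed. The model name, column names, and connector settings are illustrative assumptions, and the use of `{{ this }}` for the object name is assumed here rather than confirmed by this README.

```sql
-- models/raw_events.sql (hypothetical model name)
{{ config(materialized='source') }}

-- The model body is the full CREATE SOURCE statement.
CREATE SOURCE {{ this }} (
    event_id int,
    user_id int
) WITH (
    connector = 'datagen',
    fields.event_id.kind = 'sequence',
    fields.event_id.start = '1',
    fields.event_id.end = '1000',
    fields.user_id.kind = 'random',
    fields.user_id.min = '1',
    fields.user_id.max = '10',
    datagen.rows.per.second = '10'
) FORMAT PLAIN ENCODE JSON
```

Downstream models can then select from the source with `ref` as they would from any other model.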

dbt Run Behavior

  • dbt run: creates models that do not already exist.
  • dbt run --full-refresh: drops and recreates models so the deployed objects match the current dbt definitions.

Graph Operators

Graph operators are useful when you want to rebuild only part of a project.

dbt run --select "my_model+"   # select my_model and all children
dbt run --select "+my_model"   # select my_model and all parents
dbt run --select "+my_model+"  # select my_model, and all of its parents and children

Data Tests

dbt-risingwave extends dbt data-test failure storage to support materialized_view in addition to the upstream table and view options.

Example:

models:
  - name: my_model
    columns:
      - name: id
        tests:
          - not_null:
              config:
                store_failures: true
                store_failures_as: materialized_view

This is useful for realtime monitoring workflows where test failures should remain continuously queryable as a RisingWave materialized view.
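
The data-test failure storage described in this section is shown for a generic `not_null` test. As a hedged sketch, the same settings could be written in the config block of a singular test; this assumes the adapter honors `store_failures_as='materialized_view'` for singular tests as well, which this README does not state. The file name is hypothetical.

```sql
-- tests/assert_my_model_id_not_null.sql (hypothetical file name)
{{ config(
    store_failures=true,
    store_failures_as='materialized_view'
) }}

-- Rows returned by a dbt data test are treated as failures.
select *
from {{ ref('my_model') }}
where id is null
```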
