Skip to main content

The ClickZetta adapter plugin for dbt

Project description

dbt-clickzetta

The dbt adapter for ClickZetta Lakehouse.

See the examples/ directory for complete, runnable examples of all features.

Installation

pip install dbt-clickzetta

Requires Python 3.8+ and dbt-core 1.8+.

Note on versions: The legacy dbt-clickzetta 0.2.x series on PyPI requires dbt-core ~1.5 and is no longer maintained. Use dbt-clickzetta >= 1.6 which supports dbt-core 1.8+ and includes full three-part naming (workspace.schema.table), incremental strategies, snapshots, and all modern dbt features.

Quickstart

1. Configure profiles.yml

my_project:
  target: dev
  outputs:
    dev:
      type: clickzetta
      service: cn-shanghai-alicloud.api.clickzetta.com
      instance: your_instance
      workspace: your_workspace
      username: your_username
      password: your_password
      schema: your_schema
      vcluster: default_ap

2. Test connection

dbt debug

3. Run your project

dbt run
dbt test
dbt docs generate

Supported Features

Feature Supported
table materialization
view materialization
incremental materialization
ephemeral materialization
snapshot (SCD Type 2)
dynamic_table materialization
materialized_view materialization
dbt test (generic + singular)
dbt seed
dbt docs generate ✅ (row count, size, last modified)
dbt source freshness
persist_docs (relation + columns)
Partitioned tables
Clustered tables
Python models
on_schema_change ✅ (append_new_columns, sync_all_columns)
grants
clone materialization ✅ (zero-copy clone + Time Travel clone)
Indexes (Bloomfilter / Inverted / Vector) ✅ (auto-created via indexes config)
Table Stream as source ✅ (declare in sources.yml, reference via source())
VCluster per-model ✅ (via vcluster config)

Incremental Strategies

Strategy Description
merge (default) MERGE INTO with unique_key
append INSERT INTO without deduplication
insert_overwrite INSERT OVERWRITE with dynamic partition mode
delete+insert DELETE matching keys then INSERT, suitable for partition replacement without a primary key
{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='id'
) }}

Indexes

Supports Bloomfilter, Inverted, and Vector index types. Indexes are created automatically after the table is built:

{{ config(
    materialized='table',
    indexes=[
        {'type': 'bloomfilter', 'columns': ['order_id']},
        {'type': 'inverted', 'columns': ['status'], 'analyzer': 'unicode'},
        {'type': 'vector', 'columns': ['embedding'], 'distance_function': 'cosine_distance', 'scalar_type': 'f32'}
    ]
) }}

VCluster per-model

Assign a specific VCluster to a model for compute resource isolation:

{{ config(
    materialized='table',
    vcluster='large_ap'   -- this model runs on the large_ap cluster
) }}

Utility Macros

Run via dbt run-operation:

# Compact small files (useful after high-frequency incremental writes)
dbt run-operation optimize_table --args '{relation: my_schema.my_table}'
dbt run-operation optimize_table --args '{relation: my_schema.my_table, where: "dt >= current_date() - interval 7 days"}'

# Switch VCluster for the current session
dbt run-operation use_vcluster --args '{vcluster: large_ap}'

# List recently dropped objects available for recovery
dbt run-operation show_tables_history --args '{schema: my_schema}'

# Recover a dropped object (table, dynamic table, materialized view, or stream)
dbt run-operation undrop --args '{relation: my_schema.my_table}'

# Drop an object (type: table | view | dynamic_table | materialized_view | stream)
dbt run-operation drop_object --args '{relation: my_schema.my_table, type: table}'

# Manually refresh a dynamic table
dbt run-operation refresh_dynamic_table --args '{model_name: my_dynamic_table}'

Table Stream as Source

Declare a Table Stream in sources.yml and reference it via source(). The stream appends three system columns to every row: __change_type, __commit_version, __commit_timestamp.

# sources.yml
sources:
  - name: my_streams
    schema: my_schema
    tables:
      - name: orders_stream
-- Option 1: explicit columns (production-safe)
select __change_type, __commit_timestamp, order_id, amount
from {{ source('my_streams', 'orders_stream') }}

-- Option 2: SELECT * EXCEPT — returns only user columns, no hardcoded list
select * except(__change_type, __commit_timestamp, __commit_version)
from {{ source('my_streams', 'orders_stream') }}

Note: SELECT does not advance the stream offset. Only DML statements (INSERT, MERGE, etc.) advance it.

Dynamic Table

{{ config(
    materialized='dynamic_table',
    refresh_interval='5 minutes',
    refresh_vc='default_ap'
) }}
select id, name, amount
from {{ ref('orders') }}

After creation, the table is automatically refreshed once (equivalent to Snowflake's initialize=ON_CREATE). Subsequent refreshes run on the configured interval.

Snapshot

Snapshots use standard dbt SCD Type 2 via MERGE INTO on regular tables (no delta/iceberg required).

{% snapshot orders_snapshot %}
{{ config(
    target_schema='snapshots',
    unique_key='id',
    strategy='timestamp',
    updated_at='updated_at'
) }}
select * from {{ source('raw', 'orders') }}
{% endsnapshot %}

Connection Parameters

Parameter Required Description
type Must be clickzetta
service API endpoint, e.g. cn-shanghai-alicloud.api.clickzetta.com
instance Instance name
workspace Workspace name
username Username
password Password
schema Default schema
vcluster VCluster name, e.g. default_ap
connect_retries Connection retry count (default: 3)

Development

# Clone
git clone https://github.com/clickzetta/dbt-clickzetta.git
cd dbt-clickzetta

# Install in editable mode
pip install -e .

# Run unit tests
pip install pytest
pytest tests/unit/

# Run functional tests (requires a real Lakehouse connection)
cp test.env.example test.env
# Fill in test.env with your connection details
pytest tests/functional/

License

Apache 2.0

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dbt_clickzetta-1.7.2.tar.gz (33.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dbt_clickzetta-1.7.2-py3-none-any.whl (41.6 kB view details)

Uploaded Python 3

File details

Details for the file dbt_clickzetta-1.7.2.tar.gz.

File metadata

  • Download URL: dbt_clickzetta-1.7.2.tar.gz
  • Upload date:
  • Size: 33.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dbt_clickzetta-1.7.2.tar.gz
Algorithm Hash digest
SHA256 3daa0029ec885ec1a4a262d5fe271e34af748e6ae3f3e32b900a682648478871
MD5 3e4d9dcebde5cd8a30596108f8a951ed
BLAKE2b-256 662258c654fb14cab66fecafe9545be3048f3a767fef25296ffbaa24636ff156

See more details on using hashes here.

Provenance

The following attestation bundles were made for dbt_clickzetta-1.7.2.tar.gz:

Publisher: release.yml on clickzetta/dbt-clickzetta

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dbt_clickzetta-1.7.2-py3-none-any.whl.

File metadata

  • Download URL: dbt_clickzetta-1.7.2-py3-none-any.whl
  • Upload date:
  • Size: 41.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dbt_clickzetta-1.7.2-py3-none-any.whl
Algorithm Hash digest
SHA256 49cd856c6b9e075c5bcd90112090fc1829c8b4ca75c4fb997b309b37da9ddb16
MD5 e898621c1eeee0eea0f405eb5b0b8f1b
BLAKE2b-256 5114231a514e3efbabb1d596d3aef38eb107b4dd25467267eb242d9e7c3d2d26

See more details on using hashes here.

Provenance

The following attestation bundles were made for dbt_clickzetta-1.7.2-py3-none-any.whl:

Publisher: release.yml on clickzetta/dbt-clickzetta

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page