The ClickZetta adapter plugin for dbt
dbt-clickzetta
The dbt adapter for ClickZetta Lakehouse.
See the examples/ directory for complete examples of each feature.
Installation
pip install dbt-clickzetta
Requires Python 3.8+ and dbt-core 1.8+.
Quickstart
1. Configure profiles.yml
my_project:
  target: dev
  outputs:
    dev:
      type: clickzetta
      service: cn-shanghai-alicloud.api.clickzetta.com
      instance: your_instance
      workspace: your_workspace
      username: your_username
      password: your_password
      schema: your_schema
      vcluster: default_ap
2. Test connection
dbt debug
3. Run your project
dbt run
dbt test
dbt docs generate
Supported Features
| Feature | Supported |
|---|---|
| table materialization | ✅ |
| view materialization | ✅ |
| incremental materialization | ✅ |
| ephemeral materialization | ✅ |
| snapshot (SCD Type 2) | ✅ |
| dynamic_table materialization | ✅ |
| materialized_view materialization | ✅ |
| dbt test (generic + singular) | ✅ |
| dbt seed | ✅ |
| dbt docs generate | ✅ (includes row count, size, and last-modified time) |
| dbt source freshness | ✅ |
| persist_docs (relation + columns) | ✅ |
| Partitioned tables | ✅ |
| Clustered tables | ✅ |
| Python models | ✅ |
| on_schema_change | ✅ (append_new_columns, sync_all_columns) |
| grants | ✅ |
| clone materialization | ✅ (zero-copy clone + Time Travel clone) |
| Indexes (Bloomfilter / Inverted / Vector) | ✅ (created automatically via the indexes config) |
| Table Stream as source | ✅ (declared in sources.yml, referenced with source()) |
| Per-model VCluster switching | ✅ (via the vcluster config) |
Incremental Strategies
| Strategy | Description |
|---|---|
| merge (default) | MERGE INTO with unique_key |
| append | INSERT INTO without deduplication |
| insert_overwrite | INSERT OVERWRITE with dynamic partition mode |
| delete+insert | DELETE matching keys then INSERT, suitable for partition replacement without a primary key |
{{ config(
materialized='incremental',
incremental_strategy='merge',
unique_key='id'
) }}
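The row-level effect of these strategies can be illustrated on a tiny in-memory table. The sketch below is only a simplified model, assuming a table is a list of dicts; it is not how the adapter runs the strategies (the adapter issues SQL), and the function names are hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def append(target: List[Row], new_rows: List[Row]) -> List[Row]:
    """append: keep every existing row and add the batch, duplicates included."""
    return target + new_rows


def merge(target: List[Row], new_rows: List[Row], unique_key: str) -> List[Row]:
    """merge: update rows whose key already exists, insert the rest."""
    by_key = {row[unique_key]: row for row in target}
    for row in new_rows:
        by_key[row[unique_key]] = row  # matched: update; not matched: insert
    return list(by_key.values())


def delete_insert(target: List[Row], new_rows: List[Row], key: str) -> List[Row]:
    """delete+insert: delete target rows whose key appears in the batch, then insert the batch."""
    incoming = {row[key] for row in new_rows}
    kept = [row for row in target if row[key] not in incoming]
    return kept + new_rows
```

Because delete_insert does not require the key to be unique, it can replace every row that shares a value such as a date, which is why the table above describes it as suitable for partition replacement without a primary key.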
Indexes
Bloomfilter, Inverted, and Vector indexes are supported and are created automatically after the table is built:
{{ config(
materialized='table',
indexes=[
{'type': 'bloomfilter', 'columns': ['order_id']},
{'type': 'inverted', 'columns': ['status'], 'analyzer': 'unicode'},
{'type': 'vector', 'columns': ['embedding'], 'distance_function': 'cosine_distance', 'scalar_type': 'f32'}
]
) }}
VCluster per-model
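Assign a compute cluster to an individual model to isolate resources between large and small models:
{# This model runs on the large_ap cluster #}
{{ config(
materialized='table',
vcluster='large_ap'
) }}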
Utility Macros
Operational macros invoked with dbt run-operation:
# Compact small files (use after high-frequency incremental writes)
dbt run-operation optimize_table --args '{relation: my_schema.my_table}'
dbt run-operation optimize_table --args '{relation: my_schema.my_table, where: "dt >= current_date() - interval 7 days"}'
# Switch VCluster
dbt run-operation use_vcluster --args '{vcluster: large_ap}'
# List dropped objects that can still be restored
dbt run-operation show_tables_history --args '{schema: my_schema}'
# Restore an accidentally dropped object (supports regular tables, dynamic tables, materialized views, and Table Streams)
dbt run-operation undrop --args '{relation: my_schema.my_table}'
# Drop an object (type: table | view | dynamic_table | materialized_view | stream)
dbt run-operation drop_relation --args '{relation: my_schema.my_table, type: table}'
# Manually refresh a dynamic table
dbt run-operation refresh_dynamic_table --args '{model_name: my_dynamic_table}'
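When these macros are called from a scheduler or script, quoting the --args value by hand is error-prone. The following is a minimal sketch of a wrapper, assuming dbt is on the PATH; run_operation_command is a hypothetical helper and not part of the adapter. It serializes the arguments as JSON, relying on dbt parsing --args as YAML, which accepts a JSON object.

```python
import json
from typing import Any, Dict, List


def run_operation_command(macro: str, args: Dict[str, Any]) -> List[str]:
    """Build the argv list for `dbt run-operation <macro> --args '<yaml>'`."""
    # json.dumps handles quoting of values such as WHERE clauses.
    return ["dbt", "run-operation", macro, "--args", json.dumps(args)]


cmd = run_operation_command(
    "optimize_table",
    {"relation": "my_schema.my_table"},
)
print(cmd)
# To execute it: subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids a second layer of shell quoting.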
Dynamic Table
{{ config(
materialized='dynamic_table',
refresh_interval='5 minutes',
refresh_vc='default_ap'
) }}
select id, name, amount
from {{ ref('orders') }}
After creation, the table is automatically refreshed once (equivalent to Snowflake's initialize=ON_CREATE). Subsequent refreshes run on the configured interval.
Snapshot
Snapshots use standard dbt SCD Type 2 via MERGE INTO on regular tables (no delta/iceberg required).
{% snapshot orders_snapshot %}
{{ config(
target_schema='snapshots',
unique_key='id',
strategy='timestamp',
updated_at='updated_at'
) }}
select * from {{ source('raw', 'orders') }}
{% endsnapshot %}
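To show what the timestamp strategy records, here is a simplified in-memory sketch of one snapshot run. It assumes the standard dbt meta-columns dbt_valid_from and dbt_valid_to, ignores deleted source rows and other meta-columns, and does not reflect the adapter's actual MERGE INTO statement; apply_snapshot is a hypothetical name.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def apply_snapshot(snapshot: List[Row], source: List[Row],
                   unique_key: str = "id",
                   updated_at: str = "updated_at") -> List[Row]:
    """Return a new snapshot after one run of the timestamp strategy (simplified)."""
    result = [dict(row) for row in snapshot]  # copy so the input is not mutated
    # Index the currently valid version of each key (dbt_valid_to is None).
    current = {row[unique_key]: row for row in result if row["dbt_valid_to"] is None}
    for src in source:
        live = current.get(src[unique_key])
        if live is not None and src[updated_at] <= live[updated_at]:
            continue  # the source has no newer version of this record
        if live is not None:
            live["dbt_valid_to"] = src[updated_at]  # close the old version
        # Insert the new version as the currently valid row.
        result.append(dict(src, dbt_valid_from=src[updated_at], dbt_valid_to=None))
    return result
```

Each change to a source row therefore closes one history row and opens another, so the snapshot table keeps the full history per key.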
Connection Parameters
| Parameter | Required | Description |
|---|---|---|
| type | ✅ | Must be clickzetta |
| service | ✅ | API endpoint, e.g. cn-shanghai-alicloud.api.clickzetta.com |
| instance | ✅ | Instance name |
| workspace | ✅ | Workspace name |
| username | ✅ | Username |
| password | ✅ | Password |
| schema | ✅ | Default schema |
| vcluster | ✅ | VCluster name, e.g. default_ap |
| connect_retries | ❌ | Connection retry count (default: 3) |
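The table above can be turned into a quick pre-flight check before running dbt debug. This is a minimal sketch, assuming the target has already been loaded into a dict; check_target is a hypothetical helper, not part of the adapter, and dbt debug remains the authoritative check.

```python
from typing import Any, Dict

# Required keys and the one documented default, taken from the table above.
REQUIRED = ("type", "service", "instance", "workspace",
            "username", "password", "schema", "vcluster")
DEFAULTS = {"connect_retries": 3}


def check_target(target: Dict[str, Any]) -> Dict[str, Any]:
    """Validate a profiles.yml target dict and fill in documented defaults."""
    missing = [key for key in REQUIRED if not target.get(key)]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    if target["type"] != "clickzetta":
        raise ValueError("type must be 'clickzetta'")
    # Values set explicitly in the target win over the defaults.
    return {**DEFAULTS, **target}
```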
Known Limitations
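| Limitation | Details |
|---|---|
| HAVING without GROUP BY | ClickZetta supports HAVING without GROUP BY, but the SELECT list must contain an aggregate function. A SELECT with only constants or plain columns raises an error. In dbt tests, use a subquery with WHERE instead. |
| SHOW GRANTS is not usable in dbt generic tests | dbt generic tests wrap the SQL in select count(*) from (...), and SHOW GRANTS cannot be wrapped this way. To verify privileges, use a singular test built on run_query and {% if execute %}. Note: most ClickZetta SHOW commands can be used in a subquery; SHOW GRANTS is the exception. |
| Dynamic tables do not support changing the SQL definition | ALTER DYNAMIC TABLE supports suspend / resume / rename column / set comment, but not changing the query SQL or the refresh interval. To change the definition, rebuild with dbt run --full-refresh. |
| CREATE OR REPLACE on materialized views is restricted | CREATE OR REPLACE MATERIALIZED VIEW cannot be used directly; it only works with a specific combination of parameters. dbt handles this by running DROP and then CREATE, so the view is briefly unavailable for queries in between. |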
Development
# Clone
git clone https://github.com/clickzetta/dbt-clickzetta.git
cd dbt-clickzetta
# Install in editable mode
pip install -e .
# Run unit tests
pip install pytest
pytest tests/unit/
# Run functional tests (requires a real Lakehouse connection)
cp test.env.example test.env
# Fill in test.env with your connection details
pytest tests/functional/
License
Apache 2.0