The OceanBase adapter plugin for dbt
Project description
dbt-oceanbase
OceanBase Database is a native distributed database, independently developed by Ant Group. It leverages the Paxos protocol and a distributed architecture to achieve high availability and linear scalability. Additionally, OceanBase is an HTAP (Hybrid Transactional and Analytical Processing) database capable of handling both online transactional and analytical workloads. It runs on common server clusters without relying on specialized hardware.
The dbt-oceanbase adapter integrates OceanBase with DBT, allowing users to create DBT projects or build data warehouses on OceanBase. Currently, this project only supports OceanBase databases in the MySQL mode. Support for the Oracle mode is not yet available.
Installation
This project is managed via PyPi, so you will need to have setuptools installed in advance. You can install dbt-oceanbase using the following shell command:
pip3 install dbt-oceanbase
Alternatively, you can install dbt-oceanbase from the source code. First, clone the repository locally, then navigate to the project directory and run:
pip3 install .
The current compatible version of dbt-core is 1.8.0b2. We aim to maintain compatibility between dbt-oceanbase and the latest versions of dbt-core. Contributions in the form of Issues and PRs are welcomed.
Quick Start
Initialization
You can initialize your dbt project using the following shell command:
dbt init ${your_project_name}
During the setup, select obmysql to create a project for development on OceanBase:
13:32:20 Running with dbt=1.8.0-b2
...
Happy modeling!
13:32:27 Setting up your profile.
Which database would you like to use?
[1] postgres
[2] obmysql
Configure Your Profile
The Profile file configures your dbt project, guiding dbt-core on how to connect to the OceanBase database and execute your dbt project. This file is typically located in the ~/.dbt directory. Below is a sample Profile configuration:
default:
outputs:
dev:
type: obmysql
threads: 1
host: 127.0.0.1
port: 2881
user: user@tenant#cluster
pass: your_password
schema: schema/database
target: dev
| Option | Description | Required | Example |
|---|---|---|---|
| type | The dbt plugin name, enumerated value. Only obmysql is accepted. |
Yes | obmysql |
| threads | Maximum number of threads to run, optional, default is 1. | No | 1 |
| host | The host address where the database is located. | Yes | 127.0.0.1 |
| port | The port number where the database is located. | Yes | 2881 |
| user | The database login username, typically composed of username, tenant, and cluster. | Yes | username@tenant#cluster |
| pass | The database login password. | Yes | - |
| schema | The default schema (Oracle mode) or database (MySQL mode) that dbt connects to. | Yes (MySQL mode) | test |
Run the Project
After initializing the project and configuring the Profile.yaml file, you can verify the project configuration with the following command:
dbt debug
If the connection to the database is successful, you can run your first dbt project with:
dbt run
During the initialization of the project, dbt-core has already created a demo project for you.
If you see the following output, it indicates that the project has been successfully configured and executed:
13:39:29 Running with dbt=1.8.0-b2
13:39:30 Registered adapter: obmysql=1.0.1
13:39:30 Found 2 models, 4 data tests, 1 snapshot, 411 macros
13:39:30
13:39:32 Concurrency: 1 threads (target='dev')
13:39:32
13:39:32 1 of 2 START sql table model test.my_first_dbt_model ......................... [RUN]
13:39:33 1 of 2 OK created sql table model test.my_first_dbt_model .................... [SELECT 2 in 0.74s]
13:39:33 2 of 2 START sql view model test.my_second_dbt_model ......................... [RUN]
13:39:33 2 of 2 OK created sql view model test.my_second_dbt_model .................... [CREATE VIEW in 0.53s]
13:39:34
13:39:34 Finished running 1 table model, 1 view model in 0 hours 0 minutes and 3.42 seconds (3.42s).
13:39:34
13:39:34 Completed successfully
13:39:34
13:39:34 Done. PASS=2 WARN=0 ERROR=0 SKIP=0 TOTAL=2
You can then check in the OceanBase database to see if a table named my_first_dbt_model and a view named my_second_dbt_model have been created.
Features
| Feature | Supported | Notes |
|---|---|---|
| Table | ✅ | - |
| View | ✅ | - |
| Incremental | ✅ | - |
| Ephemeral | ✅ | - |
| Materialized View | ✅ | - |
| Docs | ✅ | Documentation Generation |
| Seed | ✅ | Data Import |
| Snapshot | ✅ | Snapshot Creation |
Table
dbt-oceanbase provides compatibility with table objects, allowing users to maximize the use of OceanBase's table features.
| Feature | Supported | Example | Description |
|---|---|---|---|
| Columnar Storage | ✅ | {{ config(column_groups=['all columns', 'r_name(col1, col2)']) }} |
- |
| Temporary Table | ❌ | - | Temporary tables are not supported in OceanBase MySQL mode |
| Contract | ✅ | {{ config(contract={'enforced': True}) }} |
- |
| Check Constraints (Column/Table) | ✅ | constraints.type='check' |
- |
| Not Null Constraints (Column/Table) | ✅ | constraints.type='not_null' |
- |
| Unique Constraints (Column/Table) | ✅ | constraints.type='unique' |
- |
| Primary Key Constraints (Column/Table) | ✅ | constraints.type='primary_key' |
- |
| Foreign Key Constraints (Table) | ✅ | constraints.type='foreign_key' |
- |
| Table-level Comments | ✅ | models.description='this is the comment' |
- |
| Indexes | ✅ | {{ config(indexes=[{"columns": ["id"],"algorithm": "BTREE", "unique": True, "options": ['GLOBAL'], "name": "idx", "column_groups": ['all columns', 'r_name(col1, col2)']}]) }} |
- |
View
| Feature | Supported | Example | Description |
|---|---|---|---|
| Contract | ✅ | {{ config(contract={'enforced': True}) }} |
- |
| Columns | ✅ | {{ config(columns=['col1', 'col2']) }} |
- |
| Check Option | ✅ | {{ config(check_option="with check option") }} |
Options: WITH CHECK OPTION, WITH CASCADED CHECK OPTION, WITH LOCAL CHECK OPTION |
| Table-level Comments | ❌ | - | OceanBase MySQL mode does not support view comments |
Materialized View
| Feature | Supported | Example | Description |
|---|---|---|---|
| Full Refresh | ✅ | {{ config(full_refresh=True) }} |
When set to True, dbt will drop and recreate the materialized view if one with the same name already exists; if False, dbt will compare the old and new materialized views and convert the existing one to the new definition. |
| Table Options | ✅ | {{ config(table_options=["comment='this is a comment'"]) }} |
- |
| Columns | ✅ | {{ config(columns=['col1', 'col2']) }} |
- |
| Refresh Mode | ✅ | {{ config(refresh_mode="never refresh") }} |
Refresh options for the materialized view, including NEVER REFRESH, REFRESH (FAST|COMPLETE|FORCE) [ON (DEMAND|COMMIT|STATEMENT)] [[START WITH expr][NEXT expr]]. |
| Check Option | ✅ | {{ config(check_option="with check option") }} |
Options include WITH CHECK OPTION, WITH CASCADED CHECK OPTION, and WITH LOCAL CHECK OPTION. |
| Indexes | ✅ | {{ config(indexes=[{"columns": ["id"],"algorithm": "BTREE", "unique": True, "options": ['GLOBAL'], "name": "idx", "column_groups": ['all columns', 'r_name(col1, col2)']}]) }} |
- |
| On Configuration Change | ✅ | {{ config(on_configuration_change="apply") }} |
Options include apply, continue, and fail. Effective when there is an existing materialized view with the same name and full_refresh is set to False, but the existing and new definitions differ.
|
| Table-level comments | ❌ | - | OceanBase MySQL mode does not support adding comments to materialized views. |
Incremental
The incremental materialization is an extension of the table materialization, meaning that it naturally supports all features supported by the table materialization. Below are the specific properties that incremental alone supports:
| Feature | Supported | Example | Description |
|---|---|---|---|
| Unique Key | ✅ | {{ config(unique_key="id") }} |
Accepts string values only. Effective when incremental_strategy is set to delete+insert and there is an existing table with the same name. It will act as a unique key in the where condition to delete data from the existing table. |
| Full Refresh | ✅ | {{ config(full_refresh=True) }} |
When set to True, dbt will drop and recreate the table if one with the same name already exists; if False, dbt will compare the old and new table objects and modify the existing one to match the new definition. |
| On Schema Change | ✅ | {{ config(on_schema_change='sync_all_columns') }} |
Options include sync_all_columns, append_new_columns, fail, and ignore.
|
| incremental_strategy | ✅ | {{ config(incremental_strategy='delete+insert') }} |
Options include delete+insert, append, and default. append and default have the same meaning — each run appends all data to the existing table object; delete+insert: first deletes data from the existing table object and then appends all data. |
| incremental_predicates | ✅ | {{ config(incremental_predicates='a=b and c=d')}} |
Effective when incremental_strategy is set to delete+insert, helps filter data during the deletion process. |
Using the incremental materialization effectively requires certain techniques. For detailed usage, refer to the documentation.
Snapshot
Snapshot is one of the most useful features in dbt, allowing users to track changes in tables of user data. For more information on snapshot usage, refer to the official documentation.
| Feature | Supported | Example | Description |
|---|---|---|---|
| Alias | ✅ | {{ config(alias="snapshot_tbl_name") }} |
Name of the snapshot table. If not provided, dbt will generate a default name. |
| Strategy | ✅ | {{ config(strategy="check") }} |
Snapshot strategy, accepts only check and timestamp. |
| Unique Key | ✅ | {{ config(unique_key=['id']) }} |
Unique key, used to uniquely identify a row of data, typically the primary key or unique key of the table. |
| Check Cols | ✅ | {{ config(check_cols=['name']) }} |
Effective when strategy is check, specifies the set of columns to be tracked by the snapshot. Only columns in this set will be tracked. |
| Invalidate Hard Deletes | ✅ | {{ config(invalidate_hard_deletes=True) }} |
Whether to track deletion actions in the table as part of the snapshot. |
| Updated At | ✅ | {{ config(updated_at='update_time')}} |
The timestamp column, describes the column in the original table that records the modification time of the record. |
| Indexes | ✅ | {{ config(indexes=[{"columns": ["id"],"algorithm": "BTREE", "unique": True, "options": ['GLOBAL'], "name": "idx", "column_groups": ['all columns', 'r_name(col1, col2)']}]) }} |
Used to add indexes to the snapshot table. |
| Target Database | ✅ | {{ config(target_database='dbname') }} |
The database where the snapshot is located. If not provided, it defaults to the current database. |
| Target Schema | ✅ | {{ config(target_schema='dbname') }} |
The schema where the snapshot is located. Since OceanBase MySQL mode only has the concept of databases, this value should be consistent with target_database. |
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file dbt_oceanbase-1.0.1.tar.gz.
File metadata
- Download URL: dbt_oceanbase-1.0.1.tar.gz
- Upload date:
- Size: 26.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f6fb8a1f6b2931c55821853090c8abcfbc07d0f589a47f0237b36468505eba20
|
|
| MD5 |
b87a8f4a5febb6ee7b253bce04c15e34
|
|
| BLAKE2b-256 |
181f8b1394013cdcc2b198726bb42d82558fea85747712c09d0c7f5dca6ccf16
|
Provenance
The following attestation bundles were made for dbt_oceanbase-1.0.1.tar.gz:
Publisher:
publish.yml on oceanbase/dbt-oceanbase
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
dbt_oceanbase-1.0.1.tar.gz -
Subject digest:
f6fb8a1f6b2931c55821853090c8abcfbc07d0f589a47f0237b36468505eba20 - Sigstore transparency entry: 180609439
- Sigstore integration time:
-
Permalink:
oceanbase/dbt-oceanbase@5010796846a9775517132309f139068150b24774 -
Branch / Tag:
refs/tags/v1.0.1 - Owner: https://github.com/oceanbase
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@5010796846a9775517132309f139068150b24774 -
Trigger Event:
release
-
Statement type:
File details
Details for the file dbt_oceanbase-1.0.1-py3-none-any.whl.
File metadata
- Download URL: dbt_oceanbase-1.0.1-py3-none-any.whl
- Upload date:
- Size: 29.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2c39d61dbda045deb34624e73634f786108ab73da46dc9d8cdf88a24937820e5
|
|
| MD5 |
afc2553ea98c3fe20d7377a4dff3bf14
|
|
| BLAKE2b-256 |
57e7e336a2988964cb7210fb57b1b671036802a26f35cf873140c94fe3918804
|
Provenance
The following attestation bundles were made for dbt_oceanbase-1.0.1-py3-none-any.whl:
Publisher:
publish.yml on oceanbase/dbt-oceanbase
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
dbt_oceanbase-1.0.1-py3-none-any.whl -
Subject digest:
2c39d61dbda045deb34624e73634f786108ab73da46dc9d8cdf88a24937820e5 - Sigstore transparency entry: 180609440
- Sigstore integration time:
-
Permalink:
oceanbase/dbt-oceanbase@5010796846a9775517132309f139068150b24774 -
Branch / Tag:
refs/tags/v1.0.1 - Owner: https://github.com/oceanbase
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@5010796846a9775517132309f139068150b24774 -
Trigger Event:
release
-
Statement type: