The StarRocks adapter plugin for dbt

dbt-starrocks

This project is under development.

The dbt-starrocks package contains all the code to enable dbt to work with StarRocks.

This is an experimental plugin:

  • We have not tested it extensively
  • Requires StarRocks version 2.5.0 or higher
    • version 3.1.x is recommended
    • StarRocks versions 2.4 and below are no longer supported

Installation

This plugin can be installed via pip:

$ pip install dbt-starrocks
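
After installing, it can help to confirm which version ended up in the active environment. The snippet below is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and not part of dbt-starrocks.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "dbt-starrocks") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if the plugin is installed, otherwise None.
print(installed_version())
```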

Supported features

Feature support is tracked for three StarRocks version ranges (<= 2.5, 2.5 ~ 3.1, and >= 3.1) across the following features:

  • Table materialization
  • View materialization
  • Materialized View materialization
  • Incremental materialization
  • Primary Key Model
  • Sources
  • Custom data tests
  • Docs generate
  • Expression Partition
  • Kafka

Notice

  1. On StarRocks versions below 2.5, CREATE TABLE AS can only set engine='OLAP' and table_type='DUPLICATE'.
  2. On StarRocks 2.5 and above, CREATE TABLE AS also supports table_type='PRIMARY'.
  3. On StarRocks versions below 3.1, distributed_by is required.
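
The three notices above can be read as a small pre-flight check on a model's table config. The following is a minimal Python sketch of that reading, not the adapter's own code: the function name `check_table_config` and the `(major, minor)` tuple used for version comparison are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def check_table_config(version: Tuple[int, int], config: Dict[str, object]) -> List[str]:
    """Return rule violations for a table config on a StarRocks (major, minor) version.

    Hypothetical helper mirroring the notices above; not part of dbt-starrocks.
    """
    problems: List[str] = []
    engine = config.get("engine")
    table_type = config.get("table_type")

    if version < (2, 5):
        # Notice 1: below 2.5 only engine='OLAP' and table_type='DUPLICATE' may be set.
        if engine is not None and engine != "OLAP":
            problems.append("engine must be 'OLAP' below 2.5")
        if table_type is not None and table_type != "DUPLICATE":
            problems.append("table_type must be 'DUPLICATE' below 2.5")
    # Notice 2: from 2.5 on, table_type='PRIMARY' is accepted, so nothing is rejected here.

    if version < (3, 1) and not config.get("distributed_by"):
        # Notice 3: distributed_by is mandatory below 3.1.
        problems.append("distributed_by is required below 3.1")

    return problems


# Example: a PRIMARY table without distributed_by on StarRocks 2.4 breaks two rules.
for problem in check_table_config((2, 4), {"table_type": "PRIMARY"}):
    print(problem)
```

Versions are compared as tuples so that (2, 5) sorts before (3, 1) without any string parsing.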

Profile Configuration

Example entry for profiles.yml:

starrocks:
  target: dev
  outputs:
    dev:
      type: starrocks
      host: localhost
      port: 9030
      schema: analytics
      username: your_starrocks_username
      password: your_starrocks_password

| Option   | Description                                                   | Required? | Example                      |
|----------|---------------------------------------------------------------|-----------|------------------------------|
| type     | The specific adapter to use                                   | Required  | starrocks                    |
| host     | The hostname to connect to                                    | Required  | 192.168.100.28               |
| port     | The port to use                                               | Required  | 9030                         |
| schema   | The schema (database) to build models into                    | Required  | analytics                    |
| username | The username to use to connect to the server                  | Required  | dbt_admin                    |
| password | The password to use for authenticating to the server          | Required  | correct-horse-battery-staple |
| version  | The StarRocks version the plugin should try to be compatible with | Optional  | 3.1.0                        |
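
As a quick sanity check before running dbt, the required options listed above can be verified against a profile entry. This is a minimal sketch that works on a plain dictionary; the helper name `missing_profile_keys` is hypothetical, and loading profiles.yml itself (for example with a YAML parser) is left out.

```python
from typing import Dict, List

# Options marked "Required" in the option list above.
REQUIRED_KEYS = ("type", "host", "port", "schema", "username", "password")


def missing_profile_keys(output: Dict[str, object]) -> List[str]:
    """Return the required keys absent from one profiles.yml output entry."""
    return [key for key in REQUIRED_KEYS if key not in output]


# The "dev" output from the example profile, written as a dictionary.
dev = {
    "type": "starrocks",
    "host": "localhost",
    "port": 9030,
    "schema": "analytics",
    "username": "your_starrocks_username",
    "password": "your_starrocks_password",
}

print(missing_profile_keys(dev))
```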

Example

dbt seed properties(yml):

Complete configuration:

models:
  materialized: table       # table, view, or materialized_view
  engine: 'OLAP'
  keys: ['id', 'name', 'some_date']
  table_type: 'PRIMARY'     # PRIMARY, DUPLICATE, or UNIQUE
  distributed_by: ['id']
  buckets: 3                # default 10
  partition_by: ['some_date']
  partition_by_init: ["PARTITION p1 VALUES [('1971-01-01 00:00:00'), ('1991-01-01 00:00:00')),PARTITION p1972 VALUES [('1991-01-01 00:00:00'), ('1999-01-01 00:00:00'))"]
  # RANGE, LIST, and Expr partition types must be used together with partition_by.
  # The Expr partition type requires an expression (e.g., date_trunc) in partition_by.
  partition_type: 'RANGE'   # RANGE, LIST, or Expr
  properties: [{"replication_num":"1", "in_memory": "true"}]
  refresh_method: 'async'   # materialized views only; default is manual

dbt run config:

Example configuration:

{{ config(materialized='view') }}
{{ config(materialized='table', engine='OLAP', buckets=32, distributed_by=['id']) }}
{{ config(materialized='table', partition_by=['date_trunc("day", first_order)'], partition_type='Expr') }}
{{ config(materialized='incremental', table_type='PRIMARY', engine='OLAP', buckets=32, distributed_by=['id']) }}
{{ config(materialized='materialized_view') }}
{{ config(materialized='materialized_view', properties={"storage_medium":"SSD"}) }}
{{ config(materialized='materialized_view', refresh_method="ASYNC START('2022-09-01 10:00:00') EVERY (interval 1 day)") }}

Materialized views support only the partition_by, buckets, distributed_by, properties, and refresh_method configurations.
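
One way to catch unsupported options early is to compare a materialized view's config keys against that list. The sketch below is an illustration only: `unsupported_mv_options` is a hypothetical helper, and exempting the `materialized` key (which selects the materialization itself) is an assumption, not documented adapter behavior.

```python
from typing import Dict, List

# Options the text above lists as supported for materialized views.
MV_ALLOWED = {"partition_by", "buckets", "distributed_by", "properties", "refresh_method"}

# Assumption: `materialized` only selects the materialization, so it is not flagged.
MV_EXEMPT = {"materialized"}


def unsupported_mv_options(config: Dict[str, object]) -> List[str]:
    """Return config keys, sorted by name, that are outside the supported list."""
    return sorted(key for key in config if key not in MV_ALLOWED | MV_EXEMPT)


config = {
    "materialized": "materialized_view",
    "properties": {"storage_medium": "SSD"},
    "engine": "OLAP",
}
print(unsupported_mv_options(config))
```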

Read From Catalog

First, add the catalog to StarRocks. The following example creates a Hive catalog.

CREATE EXTERNAL CATALOG `hive_catalog`
PROPERTIES (
    "hive.metastore.uris"  =  "thrift://127.0.0.1:8087",
    "type"="hive"
);

See the documentation for how to add other types of catalogs: https://docs.starrocks.io/en-us/latest/data_source/catalog/catalog_overview

Then write the sources.yaml file.

sources:
  - name: external_example
    schema: hive_catalog.hive_db
    tables:
      - name: hive_table_name

Finally, reference the table with the source macro:

{{ source('external_example', 'hive_table_name') }}

Test Adapter

Consult the project.

Contributing

We welcome you to contribute to dbt-starrocks. Please see the Contributing Guide for more information.
