Skip to main content

The Hologres adapter plugin for dbt

Project description

dbt-hologres

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

dbt is the T in ELT. Organize, cleanse, denormalize, filter, rename, and pre-aggregate the raw data in your warehouse so that it's ready for analysis.

dbt-hologres

dbt-hologres enables dbt to work with Alibaba Cloud Hologres, a real-time data warehouse compatible with PostgreSQL.

For more information on using dbt with Hologres, consult the dbt documentation.

Getting started

Installation

Install from PyPI

pip install dbt-alibaba-cloud-hologres

Install from Source

For development or to get the latest features, you can install directly from the source code:

# Clone the repository
git clone https://github.com/aliyun/dbt-hologres.git
cd dbt-hologres

# Install in editable mode
pip install --force-reinstall -e .

This allows you to:

  • Modify the adapter code and see changes immediately
  • Contribute to the project development
  • Test unreleased features

Configuration

Configure your profiles.yml file:

hologres_project:
  target: dev
  outputs:
    dev:
      type: hologres
      host: hgxxx-xx-xxx-xx-xxx.hologres.aliyuncs.com
      port: 80
      user: BASIC$your_username
      password: your_password
      database: your_database
      schema: ""  # Use empty string if no default schema needed
      threads: 4
      connect_timeout: 10
      sslmode: disable

Product Features

Core Adapter Features

Feature Description
HologresAdapter Core adapter with PostgreSQL-compatible syntax support
Psycopg3 Driver Uses modern Psycopg 3 library for better performance
Incremental Strategies Multiple strategies: append, delete+insert, merge, microbatch
Constraints Full support for primary key, not null, unique, foreign key constraints
Catalog by Relation Enabled for better metadata management

Table Properties Configuration

Hologres-specific table properties can be configured in your model:

Property Description
orientation Storage direction: column or row
distribution_key Distribution key for data sharding
clustering_key Clustering key for query optimization
event_time_column / segment_key Event time column for time-series data
bitmap_columns Bitmap index columns
dictionary_encoding_columns Dictionary encoding columns

Example configuration:

models:
  my_model:
    materialized: table
    orientation: column
    distribution_key: user_id
    clustering_key: created_at
    event_time_column: created_at
    bitmap_columns: status,type
    dictionary_encoding_columns: category

Supported Data Types

The adapter supports standard PostgreSQL types plus Hologres-specific types:

Type Description Example
roaringbitmap Compressed bitmap for efficient set operations rb_build(array[1,2,3])
json JSON data type (text format) '{"key": "value"}'::json
jsonb Binary JSON format with index support '{"key": "value"}'::jsonb
integer[] Integer array type array[1, 2, 3]
text[] Text array type array['a', 'b', 'c']

RoaringBitmap Example:

{{ config(materialized='table') }}

select
    1 as id,
    rb_build(array[1,2,3,4,5]) as user_ids  -- Creates a bitmap for user tagging

Note: RoaringBitmap columns require materialized='table' for proper storage.

Dynamic Tables

Dynamic Tables are Hologres's implementation of materialized views with automatic refresh:

models:
  my_model:
    materialized: dynamic_table
    freshness: "30 minutes"
    auto_refresh_mode: auto
    computing_resource: serverless

Supported configurations:

  • freshness: Data freshness requirement (e.g., "30 minutes", "1 hours")
  • auto_refresh_mode: auto, incremental, or full
  • computing_resource: serverless, local, or warehouse name
  • Logical partitioning support for time-series data

Logical Partition Tables

Logical Partition Tables enable efficient data management and query optimization:

models:
  my_model:
    materialized: table
    logical_partition_key: 'ds'  # Single partition key
    # or for multiple keys:
    # logical_partition_key: 'order_year, order_month'

Supported configurations:

  • logical_partition_key: Partition column(s), supports 1-2 columns separated by comma
  • Supported types: INT, TEXT, VARCHAR, DATE, TIMESTAMP, TIMESTAMPTZ
  • Partition keys are automatically set to NOT NULL
  • Works with table properties like orientation, distribution_key, etc.

LocalDate Date Utilities

The adapter provides a powerful LocalDate class for date manipulation in Jinja2 templates:

Basic Usage

{%- set ds = adapter.parse_date('2024-03-15') -%}

{# Date arithmetic #}
{{ ds.sub_days(7) }}        {# 2024-03-08 #}
{{ ds.add_months(2) }}      {# 2024-05-15 #}

{# Period boundaries #}
{{ ds.start_of_month() }}   {# 2024-03-01 #}
{{ ds.end_of_quarter() }}   {# 2024-03-31 #}

Chained Operations

{%- set ds = adapter.parse_date('2024-03-15') -%}
{%- set start_date = ds.sub_months(2).start_of_month() -%}
{{ start_date }}  {# 2024-01-01 #}

Available Methods

Category Methods
Arithmetic add_days(), sub_days(), add_months(), sub_months(), add_years(), sub_years()
Period Boundaries start_of_month(), end_of_month(), start_of_quarter(), end_of_quarter(), start_of_year(), end_of_year(), start_of_week(), end_of_week()
Formatting format(), str(), to_sql()
Comparison is_before(), is_after(), is_equal(), days_between()
Properties year, month, day, quarter, day_of_year

Helper Functions

{# Get today's date #}
{%- set today = adapter.today() -%}

{# Parse date from string #}
{%- set ds = adapter.parse_date('2024-06-15') -%}
{%- set ds = adapter.parse_date('2024/06/15') -%}   {# Slash format #}
{%- set ds = adapter.parse_date('20240615') -%}     {# Compact format #}

SQL Macros

The adapter provides various SQL macros for Hologres-specific operations:

CTAS with Properties

{%- set with_properties = [
    "orientation = 'column'",
    "distribution_key = 'id'"
] -%}
create table {{ relation }}
with (
  {{ with_properties | join(',\n  ') }}
)
as (
  {{ compiled_code }}
);

Logical Partition Table DDL

create table {{ relation }} (
  {{ column_definitions }}
)
logical partition by list ({{ partition_columns }});

Date Utility Macros

The following macros are available for date operations:

Macro Description
parse_date(date_input) Parse date string into LocalDate object
local_date(date_input) Alias for parse_date(), providing more intuitive naming
today() Get today's date as LocalDate
ds() Get execution date from EXECUTION_DATE variable or environment
format_date(local_date, format_str) Format LocalDate to string

Example usage in SQL:

{# Using local_date macro with to_sql() for SQL-compatible output #}
select
    {{ local_date('2024-01-15').to_sql() }} as base_date,
    {{ local_date('2024-01-15').sub_days(7).to_sql() }} as week_ago

{# Date range generation #}
{%- set start = adapter.parse_date('2024-01-01') -%}
{%- set end = adapter.parse_date('2024-01-05') -%}
{%- set days = start.days_between(end) -%}

{# Format date #}
{{ ds.format('%Y%m%d') }}

Connection Parameters

Parameter Required Default Description
host Yes - Hologres instance hostname
port No 80 Port number
user Yes - Username (case-sensitive)
password Yes - Password (case-sensitive)
database Yes - Database name
schema Yes "" Default schema (use empty string "" if not needed)
threads No 1 Number of threads for parallel execution
connect_timeout No 10 Connection timeout in seconds
sslmode No disable SSL mode (disabled by default)
application_name No dbt_hologres_{version} Application identifier
retries No 1 Number of connection retries

Testing Your Connection

Run dbt debug to verify your connection:

dbt debug

Example Project Structure

my_hologres_project/
├── dbt_project.yml
├── profiles.yml
├── models/
│   ├── staging/
│   │   └── stg_orders.sql
│   ├── marts/
│   │   └── fct_orders.sql
│   └── schema.yml
└── tests/
    └── assert_positive_order_total.sql

Important Notes

  1. Case Sensitivity: Hologres usernames and passwords are case-sensitive
  2. Default Port: Default port is 80 (not 5432 like PostgreSQL)
  3. SSL Mode: SSL is disabled by default for Hologres connections
  4. Psycopg3: This adapter uses Psycopg 3, which has API differences from Psycopg 2
  5. Model Name Restrictions: Model names must not exceed 27 characters and are case-insensitive (converted to lowercase)

Supported dbt Versions

  • dbt-core >= 1.8.0
  • Python >= 3.10

Unit Tests

This project includes comprehensive unit tests with mocked database connections.

Test Coverage Overview

Test File Test Classes Test Methods Coverage Area
test_adapter.py 5 29 Adapter core functionality and configuration
test_connection.py 5 28 Connection management and credentials
test_relation.py 3 18 Relation objects and index management
test_column.py 1 8 Column handling and data types
test_local_date.py 12 64 LocalDate date utilities
test_index_config.py 2 18 Index configuration
test_dynamic_table_config.py 2 16 Dynamic table configuration
test_sql_macros.py 6 27 SQL macro rendering
test_logical_partition.py 5 22 Logical partition tables
test_date_utils_macros.py 9 33 Date utility macros
test_exception_handling.py 7 23 Exception handling
test_edge_cases.py 7 64 Edge cases and boundary conditions
Total 61 ~350

Test Class Descriptions

test_adapter.py

  • TestHologresAdapter: Core adapter functionality, incremental strategies, timestamp operations
  • TestHologresConfig: Table property configuration (orientation, distribution_key, etc.)
  • TestHologresAdapterParseDate: Date parsing functionality
  • TestHologresAdapterToday: Current date retrieval
  • TestHologresAdapterTimestampAdd: Timestamp interval SQL generation

test_connection.py

  • TestHologresCredentials: Credential validation and defaults
  • TestHologresConnectionManager: Connection lifecycle management
  • TestHologresConnectionManagerOpen: Connection opening behavior
  • TestHologresCredentialsValidation: Credential boundary validation
  • TestGetResponse: SQL response parsing

test_relation.py

  • TestHologresRelation: Relation creation and properties
  • TestGetIndexConfigChanges: Index configuration change detection
  • TestDynamicTableConfigChanges: Dynamic table configuration changes

test_local_date.py

  • TestLocalDateCreation: Date object creation from various inputs
  • TestLocalDateSubtraction: Date subtraction operations
  • TestLocalDateAddition: Date addition operations
  • TestLocalDatePeriodBoundaries: Month, quarter, year boundaries
  • TestLocalDateChaining: Method chaining support
  • TestLocalDateProperties: Year, month, day, quarter properties
  • TestLocalDateFormatting: Date format string generation
  • TestLocalDateComparison: Date comparison methods
  • TestParseDateFunction: parse_date helper function
  • TestTodayFunction: today() helper function
  • TestLocalDateWeekMethods: Week boundary methods
  • TestLocalDateDayOfYear: Day of year property

test_sql_macros.py

  • TestAdaptersMacros: CTAS and DDL statement rendering
  • TestRelationOperations: Drop, truncate, rename operations
  • TestSchemaOperations: Schema creation and deletion
  • TestViewOperations: View creation and management
  • TestInsertOperations: Insert statement generation
  • TestTimestampOperations: Timestamp interval operations

Running Tests

Using Hatch (Recommended)

Hatch is the recommended way to run tests:

# Install hatch
pip install hatch

# Run all unit tests
hatch -e cd run unit-tests

# Run a specific test file
hatch -e cd run unit-tests tests/unit/test_adapter.py

# Run a specific test class
hatch -e cd run unit-tests tests/unit/test_adapter.py::TestHologresAdapter

# Run a specific test method
hatch -e cd run unit-tests tests/unit/test_adapter.py::TestHologresAdapter::test_date_function

Using pytest Directly

You can also run tests directly with pytest:

# Install dependencies
pip install -e .
pip install pytest pytest-mock pytest-xdist freezegun

# Run all unit tests
python -m pytest tests/unit -v

# Run with parallel execution
python -m pytest tests/unit -v -n auto

# Run a specific test file
python -m pytest tests/unit/test_connection.py -v

# Run with coverage report
python -m pytest tests/unit --cov=src/dbt/adapters/hologres --cov-report=term-missing

Test Markers

Marker Description
unit Unit tests (default for tests/unit/)
integration Integration tests (requires database connection)
slow Slow-running tests
smoke Basic smoke tests

Integration Tests

Integration tests require an actual Hologres database connection and perform real database operations including creating, updating, and dropping tables.

Prerequisites

Before running integration tests, configure your Hologres connection using one of the following methods:

Method 1: Using test.env file (Recommended)

  1. Copy the example environment file:
cp test.env.example test.env
  1. Edit test.env and fill in your actual Hologres connection details:
# Hologres instance configuration
DBT_HOLOGRES_HOST=your_hologres_instance.hologres.aliyuncs.com
DBT_HOLOGRES_PORT=80
DBT_HOLOGRES_USER='BASIC$your_username'
DBT_HOLOGRES_PASSWORD='your_password'
DBT_HOLOGRES_DATABASE='your_database'
DBT_HOLOGRES_SCHEMA='test_schema'

# Enable integration tests
DBT_HOLOGRES_RUN_INTEGRATION_TESTS=true
  1. Load the environment variables before running tests:
# Load environment variables from test.env
export $(cat test.env | grep -v '^#' | xargs)

# Run integration tests
pytest tests/integration/

Method 2: Setting environment variables directly

export DBT_HOLOGRES_RUN_INTEGRATION_TESTS=true
export DBT_HOLOGRES_HOST=your_hologres_instance.hologres.aliyuncs.com
export DBT_HOLOGRES_PORT=80
export DBT_HOLOGRES_USER=your_username
export DBT_HOLOGRES_PASSWORD=your_password
export DBT_HOLOGRES_DATABASE=your_database
export DBT_HOLOGRES_SCHEMA=test_schema  # Optional, defaults to 'test_schema'

Running Integration Tests

# Run all integration tests
pytest tests/integration/

# Run specific integration test
pytest tests/integration/test_table_operations.py

# Run with verbose output
pytest tests/integration/ -v

# Run only table operation tests
pytest tests/integration/test_table_operations.py -v

# Run only view operation tests
pytest tests/integration/test_view_operations.py -v

# Run only Hologres-specific feature tests
pytest tests/integration/test_hologres_features.py -v

Integration Test Structure

The integration test suite includes:

  • test_table_operations.py: Tests for table creation, updates, deletion, and incremental models
  • test_view_operations.py: Tests for view creation, dependencies, and conversions
  • test_hologres_features.py: Tests for Hologres-specific features like indexes, dynamic tables, and partitioning

Each test uses an isolated schema to ensure tests don't interfere with each other. Test schemas are automatically cleaned up after each test run.

Test Isolation

Integration tests create unique schemas for each test to ensure isolation:

  • Each test gets a unique schema name (e.g., test_a1b2c3d4e5f6g7h8i9j0)
  • Tests clean up their schemas automatically after completion
  • Failed tests still attempt cleanup

Functional Tests

Functional tests use the official dbt test framework (dbt-tests-adapter) to verify end-to-end dbt operations against a Hologres database.

Test Structure

tests/functional/
├── conftest.py              # dbt test framework configuration
├── fixtures.py              # Shared test models and seeds
├── test_basic.py            # Basic dbt run, seed, compile tests
├── test_materializations/
│   ├── test_table.py        # Table materialization tests
│   ├── test_incremental.py  # Incremental model tests
│   └── test_dynamic_table.py # Dynamic table tests
├── test_logical_partition.py # Logical partition tests
└── test_date_utils.py       # Date utility macro tests

Running Functional Tests

# Set required environment variables
export DBT_HOLOGRES_RUN_FUNCTIONAL_TESTS=true
export DBT_HOLOGRES_HOST=your-hologres-host
export DBT_HOLOGRES_USER=your-user
export DBT_HOLOGRES_PASSWORD=your-password
export DBT_HOLOGRES_DATABASE=your-database

# Run using Hatch
hatch -e cd run integration-tests

# Or run directly with pytest
python -m pytest tests/functional/ -v

Test Coverage

Test File Test Classes Coverage
test_basic.py 5 Basic run, seed, compile, test operations
test_table.py 9 Table materialization with indexes, properties
test_incremental.py 8 Incremental strategies (append, merge, delete+insert)
test_dynamic_table.py 8 Dynamic table creation and auto-refresh
test_logical_partition.py 9 Logical partition with single/multiple keys
test_date_utils.py 10 LocalDate operations and date macros

Building and Publishing

Build

This project uses Hatchling as the build backend.

# Install hatch (if not already installed)
pip install hatch

# Build wheel and sdist (output to dist/)
hatch build

# Build wheel only
hatch build -t wheel

Verify

Validate the built package before publishing:

hatch run build:check-all

This runs twine check and installation verification to ensure package quality.

Publish to PyPI

# Install twine (if not already installed)
pip install twine

# Upload to PyPI
twine upload dist/*

You need to configure PyPI credentials beforehand, either via ~/.pypirc or environment variables:

export TWINE_USERNAME=__token__
export TWINE_PASSWORD=your-pypi-api-token

Version Management

The version number is maintained in a single file:

src/dbt/adapters/hologres/__version__.py

Hatchling reads the version automatically from this file during build — no need to update it elsewhere.

Release Checklist

  1. Update version in src/dbt/adapters/hologres/__version__.py
  2. Build: hatch build
  3. Verify: hatch run build:check-all
  4. Publish: twine upload dist/*
  5. Tag: git tag v<version> && git push origin v<version>

Resources

License

Apache License 2.0

Support

For issues and questions:

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dbt_alibaba_cloud_hologres-1.1.2.tar.gz (88.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dbt_alibaba_cloud_hologres-1.1.2-py3-none-any.whl (40.9 kB view details)

Uploaded Python 3

File details

Details for the file dbt_alibaba_cloud_hologres-1.1.2.tar.gz.

File metadata

File hashes

Hashes for dbt_alibaba_cloud_hologres-1.1.2.tar.gz
Algorithm Hash digest
SHA256 116334e71c1d9632cd3642e9727b452a3e2ccf4c13201c4e4098a40b804e9f28
MD5 a801af107b19855519647731f346f5b9
BLAKE2b-256 ff7fa5f50d12573b225f8404e554e49841820f1aa225908c8d22dbaa144580e4

See more details on using hashes here.

File details

Details for the file dbt_alibaba_cloud_hologres-1.1.2-py3-none-any.whl.

File metadata

File hashes

Hashes for dbt_alibaba_cloud_hologres-1.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 5b2c3b36dd6caa5f1b693cfa3943e8f7733fcfd4774ba88ead7def4016cce7cd
MD5 cda18342d48d8b300de51a9094f0a683
BLAKE2b-256 ac2ca2fe9d21019e4f3dacf350e4178b175aa42b4e7f92de215b0e10fbf61415

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page