Skip to main content

The Hologres adapter plugin for dbt

Project description

dbt-hologres

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

dbt is the T in ELT. Organize, cleanse, denormalize, filter, rename, and pre-aggregate the raw data in your warehouse so that it's ready for analysis.

dbt-hologres

dbt-hologres enables dbt to work with Alibaba Cloud Hologres, a real-time data warehouse compatible with PostgreSQL.

For more information on using dbt with Hologres, consult the dbt documentation.

Getting started

Installation

Install from PyPI

pip install dbt-alibaba-cloud-hologres

Install from Source

For development or to get the latest features, you can install directly from the source code:

# Clone the repository
git clone https://github.com/aliyun/dbt-hologres.git
cd dbt-hologres

# Install in editable mode
pip install --force-reinstall -e .

This allows you to:

  • Modify the adapter code and see changes immediately
  • Contribute to the project development
  • Test unreleased features

Configuration

Configure your profiles.yml file:

hologres_project:
  target: dev
  outputs:
    dev:
      type: hologres
      host: hgxxx-xx-xxx-xx-xxx.hologres.aliyuncs.com
      port: 80
      user: BASIC$your_username
      password: your_password
      database: your_database
      schema: ""  # Use empty string if no default schema needed
      threads: 4
      connect_timeout: 10
      sslmode: disable

Key Features

  • Full PostgreSQL Compatibility: Leverage familiar PostgreSQL syntax and features
  • Psycopg3 Driver: Uses the modern Psycopg 3 library for better performance
  • Dynamic Tables: Support for Hologres Dynamic Tables (materialized views with auto-refresh)
  • Incremental Models: Multiple strategies including append, delete+insert, merge, and microbatch
  • Constraints: Full support for primary keys, foreign keys, unique constraints, and more

Hologres-Specific Features

Dynamic Tables

Dynamic Tables are Hologres's implementation of materialized views with automatic refresh:

models:
  my_model:
    materialized: dynamic_table
    freshness: "30 minutes"
    auto_refresh_mode: auto
    computing_resource: serverless

Supported configurations:

  • freshness: Data freshness requirement (e.g., "30 minutes", "1 hours")
  • auto_refresh_mode: auto, incremental, or full
  • computing_resource: serverless, local, or warehouse name
  • Logical partitioning support for time-series data

Connection Parameters

Parameter Required Default Description
host Yes - Hologres instance hostname
port No 80 Port number
user Yes - Username (case-sensitive)
password Yes - Password (case-sensitive)
database Yes - Database name
schema Yes "" Default schema (use empty string "" if not needed)
threads No 1 Number of threads for parallel execution
connect_timeout No 10 Connection timeout in seconds
sslmode No disable SSL mode (disabled by default)
application_name No dbt_hologres_{version} Application identifier
retries No 1 Number of connection retries

Testing Your Connection

Run dbt debug to verify your connection:

dbt debug

Example Project Structure

my_hologres_project/
├── dbt_project.yml
├── profiles.yml
├── models/
│   ├── staging/
│   │   └── stg_orders.sql
│   ├── marts/
│   │   └── fct_orders.sql
│   └── schema.yml
└── tests/
    └── assert_positive_order_total.sql

Important Notes

  1. Case Sensitivity: Hologres usernames and passwords are case-sensitive
  2. Default Port: Default port is 80 (not 5432 like PostgreSQL)
  3. SSL Mode: SSL is disabled by default for Hologres connections
  4. Psycopg3: This adapter uses Psycopg 3, which has API differences from Psycopg 2

Supported dbt Versions

  • dbt-core >= 1.8.0
  • Python >= 3.11

Running Tests

This project includes both unit tests and integration tests.

Unit Tests

Unit tests use mocked database connections and can be run without a Hologres instance:

# Run all unit tests
pytest tests/unit/

# Run a specific test file
pytest tests/unit/test_connection.py

# Run with verbose output
pytest tests/unit/ -v

Integration Tests

Integration tests require an actual Hologres database connection and perform real database operations including creating, updating, and dropping tables.

Prerequisites

Before running integration tests, configure your Hologres connection using one of the following methods:

Method 1: Using test.env file (Recommended)

  1. Copy the example environment file:
cp test.env.example test.env
  1. Edit test.env and fill in your actual Hologres connection details:
# Hologres instance configuration
DBT_HOLOGRES_HOST=your_hologres_instance.hologres.aliyuncs.com
DBT_HOLOGRES_PORT=80
DBT_HOLOGRES_USER='BASIC$your_username'
DBT_HOLOGRES_PASSWORD='your_password'
DBT_HOLOGRES_DATABASE='your_database'
DBT_HOLOGRES_SCHEMA='test_schema'

# Enable integration tests
DBT_HOLOGRES_RUN_INTEGRATION_TESTS=true
  1. Load the environment variables before running tests:
# Load environment variables from test.env
export $(cat test.env | grep -v '^#' | xargs)

# Run integration tests
pytest tests/integration/

Method 2: Setting environment variables directly

export DBT_HOLOGRES_RUN_INTEGRATION_TESTS=true
export DBT_HOLOGRES_HOST=your_hologres_instance.hologres.aliyuncs.com
export DBT_HOLOGRES_PORT=80
export DBT_HOLOGRES_USER=your_username
export DBT_HOLOGRES_PASSWORD=your_password
export DBT_HOLOGRES_DATABASE=your_database
export DBT_HOLOGRES_SCHEMA=test_schema  # Optional, defaults to 'test_schema'

Running Integration Tests

# Run all integration tests
pytest tests/integration/

# Run specific integration test
pytest tests/integration/test_table_operations.py

# Run with verbose output
pytest tests/integration/ -v

# Run only table operation tests
pytest tests/integration/test_table_operations.py -v

# Run only view operation tests
pytest tests/integration/test_view_operations.py -v

# Run only Hologres-specific feature tests
pytest tests/integration/test_hologres_features.py -v

Integration Test Structure

The integration test suite includes:

  • test_table_operations.py: Tests for table creation, updates, deletion, and incremental models
  • test_view_operations.py: Tests for view creation, dependencies, and conversions
  • test_hologres_features.py: Tests for Hologres-specific features like indexes, dynamic tables, and partitioning

Each test uses an isolated schema to ensure tests don't interfere with each other. Test schemas are automatically cleaned up after each test run.

Test Isolation

Integration tests create unique schemas for each test to ensure isolation:

  • Each test gets a unique schema name (e.g., test_a1b2c3d4e5f6g7h8i9j0)
  • Tests clean up their schemas automatically after completion
  • Failed tests still attempt cleanup

Resources

License

Apache License 2.0

Support

For issues and questions:

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dbt_alibaba_cloud_hologres-1.0.0.tar.gz (38.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dbt_alibaba_cloud_hologres-1.0.0-py3-none-any.whl (28.4 kB view details)

Uploaded Python 3

File details

Details for the file dbt_alibaba_cloud_hologres-1.0.0.tar.gz.

File metadata

File hashes

Hashes for dbt_alibaba_cloud_hologres-1.0.0.tar.gz
Algorithm Hash digest
SHA256 69f994f5d255914b7e68e6668b4cf2468e209a1cad966aaeb1a91e434cec4b29
MD5 b208e374302600ae5030ebd3e6ec3f91
BLAKE2b-256 97294765abe0c5e7249f101f92e6064852dbad8e15de2d448e489917487f8501

See more details on using hashes here.

File details

Details for the file dbt_alibaba_cloud_hologres-1.0.0-py3-none-any.whl.

File metadata

File hashes

Hashes for dbt_alibaba_cloud_hologres-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 2f2bef0258da8d7bfd68ccf1d6a35452e12b1ced2ad62513600b789a542eab75
MD5 cb0bd7afd3f926538551c879215dda40
BLAKE2b-256 a7c9a1fc7a4fabd4b55fd2e049375202fcd2c4ed140d1199300c069816e52004

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page