dbt adapter for HatiData — Snowflake-compatible in-VPC data warehouse

Project description

dbt-hatidata

The dbt adapter for HatiData In-VPC Data Warehouse.

HatiData is a Postgres wire-compatible, Snowflake SQL-compatible data warehouse that runs entirely inside your VPC. dbt-hatidata lets you use dbt to build, test, and document your data models on HatiData with full support for Snowflake SQL syntax -- no query rewrites required.

Installation

pip install dbt-hatidata

Requires Python 3.9 or later and dbt-core 1.7+.

Configuration

Add a HatiData target to your ~/.dbt/profiles.yml:

my_project:
  target: dev
  outputs:
    dev:
      type: hatidata
      host: "{{ env_var('HATIDATA_HOST', 'localhost') }}"
      port: 5439
      user: "{{ env_var('HATIDATA_USER', 'analyst') }}"
      password: "{{ env_var('HATIDATA_API_KEY') }}"
      database: iceberg_catalog
      schema: analytics
      environment: development
      api_key: "{{ env_var('HATIDATA_API_KEY') }}"
      auto_transpile: true
      threads: 4
      connect_timeout: 30

Connection Parameters

| Parameter | Description | Default |
| --- | --- | --- |
| type | Must be hatidata | -- |
| host | HatiData proxy hostname or IP | localhost |
| port | HatiData proxy port | 5439 |
| user | Username for authentication | -- |
| password | API key used as the password | -- |
| database | Catalog name | iceberg_catalog |
| schema | Default schema for models | analytics |
| environment | HatiData environment (development, staging, production) | development |
| api_key | HatiData API key for control plane authentication | -- |
| auto_transpile | Enable automatic Snowflake-to-DuckDB SQL transpilation | true |
| threads | Number of concurrent dbt threads | 4 |
| connect_timeout | Connection timeout in seconds | 30 |

Features

Snowflake SQL Compatibility

HatiData's built-in transpiler automatically converts Snowflake SQL to DuckDB-compatible SQL, so you can bring your existing Snowflake dbt models without changes:

  • Date/time functions: DATEADD, DATEDIFF, DATE_TRUNC, TO_DATE, TO_TIMESTAMP
  • String aggregation: LISTAGG with ordering and delimiter support
  • Semi-structured data: VARIANT type mapped to JSON, PARSE_JSON, GET_PATH, colon notation (col:field)
  • Window functions: QUALIFY clause for filtering window function results
  • Table functions: FLATTEN for exploding arrays and objects
  • Type casting: :: cast syntax, Snowflake type aliases (NUMBER, STRING, TIMESTAMP_NTZ)
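
For example, a model written in the Snowflake dialect can be run as-is. The sketch below assumes a hypothetical raw.orders source with a VARIANT payload column; HatiData transpiles the Snowflake constructs into DuckDB-compatible SQL at execution time:

-- models/latest_orders.sql (illustrative; source and column names are hypothetical)
select
    order_id,
    customer_id,
    order_total::number(12, 2) as order_total,                 -- :: cast with a Snowflake type alias
    payload:shipping:country::string as shipping_country,      -- colon notation on a VARIANT column
    dateadd('day', 30, to_date(created_at)) as review_due_at   -- Snowflake DATEADD / TO_DATE
from {{ source('raw', 'orders') }}
qualify row_number() over (partition by order_id order by created_at desc) = 1  -- QUALIFY keeps the latest row per order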

Incremental Materializations

All standard dbt incremental strategies are supported:

  • append -- Insert new rows without deduplication
  • delete+insert -- Delete matching rows then insert (ideal for date-partitioned models)
  • merge -- Upsert using a unique key (uses DuckDB's INSERT ... ON CONFLICT)
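
For instance, a merge-based incremental model might look like the sketch below (the stg_events relation and its columns are illustrative):

-- models/fct_events.sql (illustrative incremental model using the merge strategy)
{{ config(
    materialized='incremental',
    incremental_strategy='merge',
    unique_key='event_id'
) }}

select
    event_id,
    user_id,
    event_type,
    event_timestamp
from {{ ref('stg_events') }}

{% if is_incremental() %}
  -- on incremental runs, only pick up events newer than what the target already holds
  where event_timestamp > (select max(event_timestamp) from {{ this }})
{% endif %}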

Column Masking

HatiData's policy engine applies column-level masking rules automatically. Sensitive columns are masked at query time based on the authenticated user's role, and dbt models respect these policies transparently.

Audit Logging

Every query executed by dbt is recorded in HatiData's immutable audit log, including the dbt model name, invocation ID, and execution metadata. Use the HatiData dashboard or query_audit table to inspect dbt run history.
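
For example, a query along these lines could list recent dbt activity; the column names used here (model_name, invocation_id, executed_at) are assumptions, so check the actual query_audit schema in your deployment:

-- illustrative only: query_audit column names are assumed, not confirmed
select
    model_name,
    invocation_id,
    executed_at
from query_audit
where executed_at >= current_date - 7
order by executed_at desc;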

Supported dbt Commands

| Command | Status |
| --- | --- |
| dbt run | Supported |
| dbt test | Supported |
| dbt seed | Supported |
| dbt snapshot | Supported |
| dbt docs generate | Supported |
| dbt docs serve | Supported |
| dbt build | Supported |

Migrating from Snowflake

If you are migrating an existing dbt project from Snowflake to HatiData, the process is straightforward:

  1. Install dbt-hatidata alongside or in place of dbt-snowflake.
  2. Update your profiles.yml to use type: hatidata with the connection parameters shown above.
  3. Run dbt run -- HatiData's transpiler handles Snowflake SQL translation automatically.

Most Snowflake SQL constructs are supported out of the box. If the transpiler encounters an unsupported pattern, HatiData's AI healer attempts an automatic fix; constructs that remain unsupported are logged so you can address them incrementally.

Models that rely on standard dbt package macros (e.g., dbt_utils.surrogate_key) also continue to work, since HatiData is Postgres wire-compatible.
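
As an illustration, a model built around dbt_utils.surrogate_key needs no changes (the relation and column names below are hypothetical):

-- models/dim_customer.sql (illustrative; relation and column names are hypothetical)
select
    {{ dbt_utils.surrogate_key(['customer_id', 'source_system']) }} as customer_key,
    customer_id,
    source_system,
    customer_name
from {{ ref('stg_customers') }}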

Development

# Clone the repository
git clone https://github.com/hatidata/dbt-hatidata.git
cd dbt-hatidata

# Install in development mode
pip install -e ".[dev]"

# Run unit tests
python -m pytest tests/unit/ -v

# Run functional tests (requires a running HatiData proxy on port 5439)
HATIDATA_HOST=localhost RUN_FUNCTIONAL=true bash ci/run_dbt_tests.sh

License

Apache License 2.0. Copyright (c) Marviy Pte Ltd. See LICENSE for details.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dbt_hatidata-0.1.1.tar.gz (13.9 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dbt_hatidata-0.1.1-py3-none-any.whl (16.4 kB)

Uploaded Python 3

File details

Details for the file dbt_hatidata-0.1.1.tar.gz.

File metadata

  • Download URL: dbt_hatidata-0.1.1.tar.gz
  • Upload date:
  • Size: 13.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for dbt_hatidata-0.1.1.tar.gz
Algorithm Hash digest
SHA256 fa019c33fd581809b47d6870eacb4e54465f35d66706da753cf9cbb97fbaaf86
MD5 24205c36920b528461c402c52b3f36c0
BLAKE2b-256 81f2872478b8c441051af4d86c95f86d174470dd47d96dd8409503bb6269c8e2

See more details on using hashes here.

File details

Details for the file dbt_hatidata-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: dbt_hatidata-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 16.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for dbt_hatidata-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 69bdabacad8dc6ee032c66fb24008c86a53c7458ad02326be9e23d09babf1d9e
MD5 8464de8c9b7065c44cba6eae9b1119a4
BLAKE2b-256 969ac0780790574b9d5d4f8ffdd4585a08d50e6454100b2d30a33688d97a4946

See more details on using hashes here.
