Skip to main content

The Teradata adapter plugin for dbt (data build tool)

Project description

dbt-teradata

This plugin ports dbt functionality to Teradata Vantage.

Installation

pip install dbt-teradata

If you are new to dbt on Teradata see dbt with Teradata Vantage tutorial.

Sample profile

Here is a working example of a dbt-teradata profile:

my-teradata-db-profile:
  target: dev
  outputs:
    dev:
      type: teradata
      host: localhost
      user: dbc
      password: dbc
      schema: dbt_test
      tmode: ANSI

At a minimum, you need to specify host, user, password, schema (database), tmode.

Python compatibility

Plugin version Python 3.6 Python 3.7 Python 3.8 Python 3.9 Python 3.10
0.19.0.x
0.20.0.x
0.21.1.x
1.0.0.x

Optional profile configurations

Logmech

The logon mechanism for Teradata jobs that dbt executes can be configured with the logmech configuration in your Teradata profile. The logmech field can be set to: TD2, LDAP, KRB5, TDNEGO. For more information on authentication options, go to Teradata Vantage authentication documentation.

my-teradata-db-profile:
  target: dev
  outputs:
    dev:
      type: teradata
      host: <host>
      user: <user>
      password: <password>
      schema: dbt_test
      tmode: ANSI
      logmech: LDAP

Port

If your Teradata database runs on port different than the default (1025), you can specify a custom port in your dbt profile using port configuration.

my-teradata-db-profile:
  target: dev
  outputs:
    dev:
      type: teradata
      host: <host>
      user: <user>
      password: <password>
      schema: dbt_test
      tmode: ANSI
      port: <port>

Other Teradata connection parameters

The plugin also supports the following Teradata connection parameters:

  • account
  • column_name
  • cop
  • coplast
  • encryptdata
  • fake_result_sets
  • field_quote
  • field_sep
  • lob_support
  • log
  • logdata
  • max_message_body
  • partition
  • sip_support
  • teradata_values

For full description of the connection parameters see https://github.com/Teradata/python-driver#connection-parameters.

Supported Features

Materializations

  • view
  • table
  • ephemeral
  • incremental

Commands

All dbt commands are supported.

Custom configurations

General

  • Enable view column types in docs - Teradata Vantage has a dbscontrol configuration flag called DisableQVCI (QVCI - Queryable View Column Index). This flag instructs the database to build DBC.ColumnsJQV with view column type definitions.

    :information_source: Existing customers, please see KB0022230 for more information about enabling QVCI.

    To enable this functionality you need to:

    1. Enable QVCI mode in Vantage. Use dbscontrol utility and then restart Teradata. Run these commands as a privileged user on a Teradata node:
      # option 551 is DisableQVCI. Setting it to false enables QVCI.
      dbscontrol << EOF
      M internal 551=false
      W
      EOF
      
      # restart Teradata
      tpareset -y Enable QVCI
      
    2. Instruct dbt to use QVCI mode. Include the following variable in your dbt_project.yml:
      vars:
        use_qvci: true
      
      For example configuration, see test/catalog/with_qvci/dbt_project.yml.

Models

Table
  • table_kind - define the table kind. Legal values are MULTISET (default for ANSI transaction mode required by dbt-teradata) and SET, e.g.:

    {{
      config(
          materialized="table",
          table_kind="SET"
      )
    }}
    

    For details, see CREATE TABLE documentation.

  • table_option - define table options. Legal values are:

    { MAP = map_name [COLOCATE USING colocation_name |
      [NO] FALLBACK [PROTECTION] |
      WITH JOURNAL TABLE = table_specification |
      [NO] LOG |
      [ NO | DUAL ] [BEFORE] JOURNAL |
      [ NO | DUAL | LOCAL | NOT LOCAL ] AFTER JOURNAL |
      CHECKSUM = { DEFAULT | ON | OFF } |
      FREESPACE = integer [PERCENT] |
      mergeblockratio |
      datablocksize |
      blockcompression |
      isolated_loading
    }
    

    where:

    • mergeblockratio:
      { DEFAULT MERGEBLOCKRATIO |
        MERGEBLOCKRATIO = integer [PERCENT] |
        NO MERGEBLOCKRATIO
      }
      
    • datablocksize:
      DATABLOCKSIZE = {
        data_block_size [ BYTES | KBYTES | KILOBYTES ] |
        { MINIMUM | MAXIMUM | DEFAULT } DATABLOCKSIZE
      }
      
    • blockcompression:
      BLOCKCOMPRESSION = { AUTOTEMP | MANUAL | ALWAYS | NEVER | DEFAULT }
        [, BLOCKCOMPRESSIONALGORITHM = { ZLIB | ELZS_H | DEFAULT } ]
        [, BLOCKCOMPRESSIONLEVEL = { value | DEFAULT } ]
      
    • isolated_loading:
      WITH [NO] [CONCURRENT] ISOLATED LOADING [ FOR { ALL | INSERT | NONE } ]
      

    Examples:

    {{
      config(
          materialized="table",
          table_option="NO FALLBACK"
      )
    }}
    
    {{
      config(
          materialized="table",
          table_option="NO FALLBACK, NO JOURNAL"
      )
    }}
    
    {{
      config(
          materialized="table",
          table_option="NO FALLBACK, NO JOURNAL, CHECKSUM = ON,
            NO MERGEBLOCKRATIO,
            WITH CONCURRENT ISOLATED LOADING FOR ALL"
      )
    }}
    

    For details, see CREATE TABLE documentation.

  • with_statistics - should statistics be copied from the base table, e.g.:

    {{
      config(
          materialized="table",
          with_statistics="true"
      )
    }}
    

    For details, see CREATE TABLE documentation.

  • index - defines table indices:

    [UNIQUE] PRIMARY INDEX [index_name] ( index_column_name [,...] ) |
    NO PRIMARY INDEX |
    PRIMARY AMP [INDEX] [index_name] ( index_column_name [,...] ) |
    PARTITION BY { partitioning_level | ( partitioning_level [,...] ) } |
    UNIQUE INDEX [ index_name ] [ ( index_column_name [,...] ) ] [loading] |
    INDEX [index_name] [ALL] ( index_column_name [,...] ) [ordering] [loading]
    [,...]
    

    where:

    • partitioning_level:
      { partitioning_expression |
        COLUMN [ [NO] AUTO COMPRESS |
        COLUMN [ [NO] AUTO COMPRESS ] [ ALL BUT ] column_partition ]
      } [ ADD constant ]
      
    • ordering:
      ORDER BY [ VALUES | HASH ] [ ( order_column_name ) ]
      
    • loading:
      WITH [NO] LOAD IDENTITY
      

    e.g.:

    {{
      config(
          materialized="table",
          index="UNIQUE PRIMARY INDEX ( GlobalID )"
      )
    }}
    

    :information_source: Note, unlike in table_option, there are no commas between index statements!

    {{
      config(
          materialized="table",
          index="PRIMARY INDEX(id)
          PARTITION BY RANGE_N(create_date
                        BETWEEN DATE '2020-01-01'
                        AND     DATE '2021-01-01'
                        EACH INTERVAL '1' MONTH)"
      )
    }}
    

    :information_source: Note, unlike in table_option, there are no commas between index statements!

    {{
      config(
          materialized="table",
          index="PRIMARY INDEX(id)
          PARTITION BY RANGE_N(create_date
                        BETWEEN DATE '2020-01-01'
                        AND     DATE '2021-01-01'
                        EACH INTERVAL '1' MONTH)
          INDEX index_attrA (attrA) WITH LOAD IDENTITY"
      )
    }}
    

Seeds

  • use_fastload - use fastload when handling dbt seed command. The option will likely speed up loading when your seed files have hundreds of thousands of rows. You can set this seed configuration option in your project.yml file, e.g.:
    seeds:
      <project-name>:
        +use_fastload: true
    

Common Teradata-specific tasks

  • collect statistics - when a table is created or modified significantly, there might be a need to tell Teradata to collect statistics for the optimizer. It can be done using COLLECT STATISTICS command. You can perform this step using dbt's post-hooks, e.g.:
    {{ config(
      post_hook=[
        "COLLECT STATISTICS ON  {{ this }} COLUMN (column_1,  column_2  ...);"
        ]
    )}}
    
    See Collecting Statistics documentation for more information.

Limitations

Transaction mode

Only ANSI transaction mode is supported.

Credits

The adapter was originally created by Doug Beatty. Teradata took over the adapter in January 2022. We are grateful to Doug for founding the project and accelerating the integration of dbt + Teradata.

License

The adapter is published using Apache-2.0 License. Please see the license for terms and conditions, such as creating derivative work and the support model.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dbt-teradata-1.0.0.1.tar.gz (26.2 kB view details)

Uploaded Source

Built Distribution

dbt_teradata-1.0.0.1-py3-none-any.whl (27.3 kB view details)

Uploaded Python 3

File details

Details for the file dbt-teradata-1.0.0.1.tar.gz.

File metadata

  • Download URL: dbt-teradata-1.0.0.1.tar.gz
  • Upload date:
  • Size: 26.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9

File hashes

Hashes for dbt-teradata-1.0.0.1.tar.gz
Algorithm Hash digest
SHA256 6b8708d41388bf4aafc7e1841ab77c2d1a45944621e7a5b38c3034eee8552a1d
MD5 d5b7ea8a5b21e9acf9b20b8654896d49
BLAKE2b-256 28cd6ae1960557968a0ad798d0bcc2f4c0e124be36a0719849356bd3f37019fa

See more details on using hashes here.

File details

Details for the file dbt_teradata-1.0.0.1-py3-none-any.whl.

File metadata

  • Download URL: dbt_teradata-1.0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 27.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9

File hashes

Hashes for dbt_teradata-1.0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 6eca571d797d34a206cd7f6707726a488b5a87f629fa6d8e4c8f07d14c8a8679
MD5 1176d9a088c2c91e0ea748175dc23a57
BLAKE2b-256 f2aa6779c6209bace0c9cb595f77978b4a4a0215b2835e2f831627100d431e26

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page