The Teradata adapter plugin for dbt (data build tool)

These details have not been verified by PyPI

Project links

Homepage

Project description

dbt-teradata

This plugin ports dbt functionality to Teradata Vantage.

Installation

pip install dbt-teradata

If you are new to dbt on Teradata see dbt with Teradata Vantage tutorial.

Sample profile

Here is a working example of a dbt-teradata profile:

my-teradata-db-profile:
  target: dev
  outputs:
    dev:
      type: teradata
      host: localhost
      user: dbc
      password: dbc
      schema: dbt_test
      tmode: ANSI

At a minimum, you need to specify host, user, password, schema (database), tmode.

Python compatibility

Plugin version	Python 3.6	Python 3.7	Python 3.8	Python 3.9	Python 3.10
0.19.0.x	✅	✅	✅	❌	❌
0.20.0.x	✅	✅	✅	✅	❌
0.21.1.x	✅	✅	✅	✅	❌
1.0.0.x	❌	✅	✅	✅	❌
1.1.0.x	❌	✅	✅	✅	✅
1.2.0.x	❌	✅	✅	✅	✅
1.3.0.x	❌	✅	✅	✅	✅

dbt dependent packages version compatibility

dbt-teradta	dbt-core	dbt-teradata-util	dbt-util
1.2.x	1.2.x	0.1.0	0.9.x or below

Optional profile configurations

Logmech

The logon mechanism for Teradata jobs that dbt executes can be configured with the logmech configuration in your Teradata profile. The logmech field can be set to: TD2, LDAP, KRB5, TDNEGO. For more information on authentication options, go to Teradata Vantage authentication documentation.

my-teradata-db-profile:
  target: dev
  outputs:
    dev:
      type: teradata
      host: <host>
      user: <user>
      password: <password>
      schema: dbt_test
      tmode: ANSI
      logmech: LDAP

Logdata

The logon mechanism for Teradata jobs that dbt executes can be configured with the logdata configuration in your Teradata profile. Addtional data like secure token, distinguished Name, or a domain/realm name can be set in your Teradata profile using logdata. The logdata field can be set to: JWT, LDAP, KRB5, TDNEGO. logdata is not used with the TD2 mechanism.

my-teradata-db-profile:
  target: dev
  outputs:
    dev:
      type: teradata
      host: <host>
      schema: dbt_test
      tmode: ANSI
      logmech: LDAP
      logdata: 'authcid=username password=password'
      port: <port>

For more information on authentication options, go to Teradata Vantage authentication documentation

Stored Password Protection

Stored Password Protection enables an application to provide a connection password in encrypted form to the driver. The plugin supports Stored Password Protection feature through prefix ENCRYPTED_PASSWORD( either in password connection parameter or in logdata connection parameter.

password

my-teradata-db-profile:
  target: dev
  outputs:
    dev:
      type: teradata
      host: <host>
      user: <user>
      password: ENCRYPTED_PASSWORD(file:PasswordEncryptionKeyFileName,file:EncryptedPasswordFileName)
      schema: dbt_test
      tmode: ANSI
      port: <port>

logdata

my-teradata-db-profile:
  target: dev
  outputs:
    dev:
      type: teradata
      host: <host>
      schema: dbt_test
      tmode: ANSI
      logmech: LDAP
      logdata: 'authcid=username password=ENCRYPTED_PASSWORD(file:PasswordEncryptionKeyFileName,file:EncryptedPasswordFileName)'
      port: <port>

For full description of Stored Password Protection see https://github.com/Teradata/python-driver#StoredPasswordProtection.

Port

If your Teradata database runs on port different than the default (1025), you can specify a custom port in your dbt profile using port configuration.

my-teradata-db-profile:
  target: dev
  outputs:
    dev:
      type: teradata
      host: <host>
      user: <user>
      password: <password>
      schema: dbt_test
      tmode: ANSI
      port: <port>

Retries

Allows an adapter to automatically try again when the attempt to open a new connection on the database has a transient, infrequent error. This option can be set using the retries configuration. Default value is 0. The default wait period between connection attempts is one second. retry_timeout (seconds) option allows us to adjust this waiting period.

If retries is set to 3, the adapter will try to establish a new connection three times if an error occurs.

my-teradata-db-profile:
  target: dev
  outputs:
    dev:
      type: teradata
      host: <host>
      user: <user>
      password: <password>
      schema: dbt_test
      tmode: ANSI
      retries: 3
      retry_timeout: 10

Other Teradata connection parameters

The plugin also supports the following Teradata connection parameters:

account
column_name
cop
coplast
encryptdata
fake_result_sets
field_quote
field_sep
lob_support
log
logdata
max_message_body
partition
sip_support
teradata_values

For full description of the connection parameters see https://github.com/Teradata/python-driver#connection-parameters.

Supported Features

Materializations

view
table
ephemeral
incremental

Incremental Materialization

The following incremental materialization strategies are supported:

append (default)
delete+insert
merge

To learn more about dbt incremental strategies please check the dbt incremental strategy documentation.

Commands

All dbt commands are supported.

Custom configurations

General

Enable view column types in docs - Teradata Vantage has a dbscontrol configuration flag called DisableQVCI (QVCI - Queryable View Column Index). This flag instructs the database to build DBC.ColumnsJQV with view column type definitions.

:information_source: Existing customers, please see KB0022230 for more information about enabling QVCI.

To enable this functionality you need to:
1. Enable QVCI mode in Vantage. Use dbscontrol utility and then restart Teradata. Run these commands as a privileged user on a Teradata node:
```
# option 551 is DisableQVCI. Setting it to false enables QVCI.
dbscontrol << EOF
M internal 551=false
W
EOF

# restart Teradata
tpareset -y Enable QVCI
```
2. Instruct dbt to use QVCI mode. Include the following variable in your dbt_project.yml:
```
vars:
  use_qvci: true
```
  For example configuration, see test/catalog/with_qvci/dbt_project.yml.

Models

Table

The following options apply to table, snapshots and seed materializations.

table_kind - define the table kind. Legal values are MULTISET (default for ANSI transaction mode required by dbt-teradata) and SET, e.g.:
- in sql materialization definition file:
```
{{
  config(
      materialized="table",
      table_kind="SET"
  )
}}
```
- in seed configuration:
```
seeds:
  <project-name>:
    table_kind: "SET"
```
For details, see CREATE TABLE documentation.

table_option - define table options. Legal values are:

{ MAP = map_name [COLOCATE USING colocation_name] |
  [NO] FALLBACK [PROTECTION] |
  WITH JOURNAL TABLE = table_specification |
  [NO] LOG |
  [ NO | DUAL ] [BEFORE] JOURNAL |
  [ NO | DUAL | LOCAL | NOT LOCAL ] AFTER JOURNAL |
  CHECKSUM = { DEFAULT | ON | OFF } |
  FREESPACE = integer [PERCENT] |
  mergeblockratio |
  datablocksize |
  blockcompression |
  isolated_loading
}

where:

mergeblockratio:

{ DEFAULT MERGEBLOCKRATIO |
  MERGEBLOCKRATIO = integer [PERCENT] |
  NO MERGEBLOCKRATIO
}

datablocksize:

DATABLOCKSIZE = {
  data_block_size [ BYTES | KBYTES | KILOBYTES ] |
  { MINIMUM | MAXIMUM | DEFAULT } DATABLOCKSIZE
}

blockcompression:

BLOCKCOMPRESSION = { AUTOTEMP | MANUAL | ALWAYS | NEVER | DEFAULT }
  [, BLOCKCOMPRESSIONALGORITHM = { ZLIB | ELZS_H | DEFAULT } ]
  [, BLOCKCOMPRESSIONLEVEL = { value | DEFAULT } ]

isolated_loading:

WITH [NO] [CONCURRENT] ISOLATED LOADING [ FOR { ALL | INSERT | NONE } ]

Examples:

in sql materialization definition file:

{{
  config(
      materialized="table",
      table_option="NO FALLBACK"
  )
}}

{{
  config(
      materialized="table",
      table_option="NO FALLBACK, NO JOURNAL"
  )
}}

{{
  config(
      materialized="table",
      table_option="NO FALLBACK, NO JOURNAL, CHECKSUM = ON,
        NO MERGEBLOCKRATIO,
        WITH CONCURRENT ISOLATED LOADING FOR ALL"
  )
}}

in seed configuration:

seeds:
  <project-name>:
    table_option:"NO FALLBACK"

seeds:
  <project-name>:
    table_option:"NO FALLBACK, NO JOURNAL"

seeds:
  <project-name>:
    table_option: "NO FALLBACK, NO JOURNAL, CHECKSUM = ON,
      NO MERGEBLOCKRATIO,
      WITH CONCURRENT ISOLATED LOADING FOR ALL"

For details, see CREATE TABLE documentation.

with_statistics - should statistics be copied from the base table, e.g.:
```
{{
  config(
      materialized="table",
      with_statistics="true"
  )
}}
```
This option is not available for seeds as seeds do not use CREATE TABLE ... AS syntax.

For details, see CREATE TABLE documentation.

index - defines table indices:

[UNIQUE] PRIMARY INDEX [index_name] ( index_column_name [,...] ) |
NO PRIMARY INDEX |
PRIMARY AMP [INDEX] [index_name] ( index_column_name [,...] ) |
PARTITION BY { partitioning_level | ( partitioning_level [,...] ) } |
UNIQUE INDEX [ index_name ] [ ( index_column_name [,...] ) ] [loading] |
INDEX [index_name] [ALL] ( index_column_name [,...] ) [ordering] [loading]
[,...]

where:

partitioning_level:

{ partitioning_expression |
  COLUMN [ [NO] AUTO COMPRESS |
  COLUMN [ [NO] AUTO COMPRESS ] [ ALL BUT ] column_partition ]
} [ ADD constant ]

ordering:

ORDER BY [ VALUES | HASH ] [ ( order_column_name ) ]

loading:
```
WITH [NO] LOAD IDENTITY
```

e.g.:

in sql materialization definition file:

{{
  config(
      materialized="table",
      index="UNIQUE PRIMARY INDEX ( GlobalID )"
  )
}}

:information_source: Note, unlike in table_option, there are no commas between index statements!

{{
  config(
      materialized="table",
      index="PRIMARY INDEX(id)
      PARTITION BY RANGE_N(create_date
                    BETWEEN DATE '2020-01-01'
                    AND     DATE '2021-01-01'
                    EACH INTERVAL '1' MONTH)"
  )
}}

{{
  config(
      materialized="table",
      index="PRIMARY INDEX(id)
      PARTITION BY RANGE_N(create_date
                    BETWEEN DATE '2020-01-01'
                    AND     DATE '2021-01-01'
                    EACH INTERVAL '1' MONTH)
      INDEX index_attrA (attrA) WITH LOAD IDENTITY"
  )
}}

in seed configuration:

seeds:
  <project-name>:
    index: "UNIQUE PRIMARY INDEX ( GlobalID )"

:information_source: Note, unlike in table_option, there are no commas between index statements!

seeds:
  <project-name>:
    index: "PRIMARY INDEX(id)
      PARTITION BY RANGE_N(create_date
                    BETWEEN DATE '2020-01-01'
                    AND     DATE '2021-01-01'
                    EACH INTERVAL '1' MONTH)"

seeds:
  <project-name>:
    index: "PRIMARY INDEX(id)
      PARTITION BY RANGE_N(create_date
                    BETWEEN DATE '2020-01-01'
                    AND     DATE '2021-01-01'
                    EACH INTERVAL '1' MONTH)
      INDEX index_attrA (attrA) WITH LOAD IDENTITY"

Seeds

Seeds, in addition to the above materialization modifiers, have the following options:

use_fastload - use fastload when handling dbt seed command. The option will likely speed up loading when your seed files have hundreds of thousands of rows. You can set this seed configuration option in your project.yml file, e.g.:
```
seeds:
  <project-name>:
    +use_fastload: true
```

Grants

Grants are supported in dbt-teradata adapter with release version 1.2.0 and above. You can use grants to manage access to the datasets you're producing with dbt. To implement these permissions, define grants as resource configs on each model, seed, or snapshot. Define the default grants that apply to the entire project in your dbt_project.yml, and define model-specific grants within each model's SQL or YAML file.

for e.g. : models/schema.yml

models:
  - name: model_name
    config:
      grants:
        select: ['user_a', 'user_b']

Another e.g. for adding multiple grants:

models:
- name: model_name
  config:
    materialized: table
    grants:
      select: ["user_b"]
      insert: ["user_c"]

:information_source: copy_grants is not supported in Teradata.

More on Grants can be found at https://docs.getdbt.com/reference/resource-configs/grants

Cross DB macros

Starting with release 1.3, some macros were migrated from teradata-dbt-utils dbt package to the connector. See the table below for the macros supported from the connector.

For using cross DB macros, teradata-utils as a macro namespace will not be used, as cross DB macros have been migrated from teradata-utils to Dbt-Teradata.

Compatibility

Macro Group	Macro Name	Status	Comment
Cross-database macros	current_timestamp	:white_check_mark:	custom macro provided
Cross-database macros	dateadd	:white_check_mark:	custom macro provided
Cross-database macros	datediff	:white_check_mark:	custom macro provided, see compatibility note
Cross-database macros	split_part	:white_check_mark:	custom macro provided
Cross-database macros	date_trunc	:white_check_mark:	custom macro provided
Cross-database macros	hash	:white_check_mark:	custom macro provided, see compatibility note
Cross-database macros	replace	:white_check_mark:	custom macro provided
Cross-database macros	type_string	:white_check_mark:	custom macro provided
Cross-database macros	last_day	:white_check_mark:	no customization needed, see compatibility note
Cross-database macros	width_bucket	:white_check_mark:	no customization

examples for cross DB macros

Replace: {{ dbt.replace("string_text_column", "old_chars", "new_chars") }} {{ replace('abcgef', 'g', 'd') }}

Date truncate: {{ dbt.date_trunc("date_part", "date") }} {{ dbt.date_trunc("DD", "'2018-01-05 12:00:00'") }}

datediff

datediff macro in teradata supports difference between dates. Differece between timestamps is not supported.

hash

Hash macro needs an md5 function implementation. Teradata doesn't support md5 natively. You need to install a User Defined Function (UDF):

Download the md5 UDF implementation from Teradata (registration required): https://downloads.teradata.com/download/extensibility/md5-message-digest-udf.
Unzip the package and go to src directory.
Start up bteq and connect to your database.
Create database GLOBAL_FUNCTIONS that will host the UDF. You can't change the database name as it's hardcoded in the macro:
```
CREATE DATABASE GLOBAL_FUNCTIONS AS PERMANENT = 60e6, SPOOL = 120e6;
```

Create the UDF. Replace <CURRENT_USER> with your current database user:

GRANT CREATE FUNCTION ON GLOBAL_FUNCTIONS TO <CURRENT_USER>;
DATABASE GLOBAL_FUNCTIONS;
.run file = hash_md5.btq

Grant permissions to run the UDF with grant option.

GRANT EXECUTE FUNCTION ON GLOBAL_FUNCTIONS TO PUBLIC WITH GRANT OPTION;

last_day

last_day in teradata_utils, unlike the corresponding macro in dbt_utils, doesn't support quarter datepart.

Common Teradata-specific tasks

collect statistics - when a table is created or modified significantly, there might be a need to tell Teradata to collect statistics for the optimizer. It can be done using COLLECT STATISTICS command. You can perform this step using dbt's post-hooks, e.g.:
```
{{ config(
  post_hook=[
    "COLLECT STATISTICS ON  {{ this }} COLUMN (column_1,  column_2  ...);"
    ]
)}}
```
See Collecting Statistics documentation for more information.

Support for `dbt-utils` package

dbt-utils package is supported through teradata/teradata_utils dbt package. The package provides a compatibility layer between dbt_utils and dbt-teradata. See teradata_utils package for install instructions.

Limitations

Transaction mode

Only ANSI transaction mode is supported.

Credits

The adapter was originally created by Doug Beatty. Teradata took over the adapter in January 2022. We are grateful to Doug for founding the project and accelerating the integration of dbt + Teradata.

License

The adapter is published using Apache-2.0 License. Please see the license for terms and conditions, such as creating derivative work and the support model.

Project details

These details have not been verified by PyPI

Project links

Homepage

Release history Release notifications | RSS feed

1.8.1

Sep 5, 2024

1.8.0

Jun 18, 2024

1.8.0a1 pre-release

Jun 18, 2024

1.7.3

Jun 10, 2024

1.7.2

Apr 2, 2024

1.7.1

Jan 11, 2024

1.7.0

Dec 4, 2023

1.6.7

Nov 8, 2023

1.6.0

Sep 29, 2023

1.5.4

Aug 9, 2023

1.4.0.0

Jul 6, 2023

This version

1.3.3.1

Jun 8, 2023

1.3.3.0

Apr 18, 2023

1.3.0.0

Apr 7, 2023

1.2.5.0

Apr 18, 2023

1.2.0.0

Jan 30, 2023

1.1.5.0

Jun 6, 2023

1.1.4.0

Apr 18, 2023

1.1.0.0

Dec 6, 2022

1.0.6.0

May 11, 2022

1.0.5.0

May 11, 2022

1.0.4.0

May 10, 2022

1.0.3.0

May 10, 2022

1.0.1.3

May 10, 2022

1.0.1.2

Apr 29, 2022

1.0.1.1

Feb 5, 2022

1.0.1.0

Jan 31, 2022

1.0.0.1

Jan 31, 2022

1.0.0.0

Jan 18, 2022

1.0.0rc1 pre-release

Jan 14, 2022

0.21.1.2

Jan 14, 2022

0.21.1.1

Jan 12, 2022

0.20.0.1

Jan 12, 2022

0.19.0.6

Jan 11, 2022

0.19.0.5

Jan 10, 2022

0.19.0.4

Jan 9, 2022

0.19.0.3

Jan 8, 2022

0.19.0.2

Jan 8, 2022

0.19.0.1

Dec 14, 2021

0.19.0a0 pre-release

Apr 4, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dbt-teradata-1.3.3.1.tar.gz (36.0 kB view details)

Uploaded Jun 8, 2023 Source

Built Distribution

dbt_teradata-1.3.3.1-py3-none-any.whl (35.0 kB view details)

Uploaded Jun 8, 2023 Python 3

File details

Details for the file dbt-teradata-1.3.3.1.tar.gz.

File metadata

Download URL: dbt-teradata-1.3.3.1.tar.gz
Upload date: Jun 8, 2023
Size: 36.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.16

File hashes

Hashes for dbt-teradata-1.3.3.1.tar.gz
Algorithm	Hash digest
SHA256	`9017e692b5bb92df1c6f200701b1f23d604e486449d0fed6dc8fc76ed7636468`
MD5	`4c7c06cf0e333897a1f960b266be2da3`
BLAKE2b-256	`e5596fd8d8400a384d9c57f1548e5a219f4831262333e0992f484cbf58e67809`

See more details on using hashes here.

File details

Details for the file dbt_teradata-1.3.3.1-py3-none-any.whl.

File metadata

Download URL: dbt_teradata-1.3.3.1-py3-none-any.whl
Upload date: Jun 8, 2023
Size: 35.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.16

File hashes

Hashes for dbt_teradata-1.3.3.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`addc0d083c49a55c70d63adb0c6260492dd5556c8d989d4cec940cedd81f40bc`
MD5	`ba912146f27ed70f4fe97abfe4a21465`
BLAKE2b-256	`7913ad7f22c910f81b89eb61c3c044d5326101672c254f7636187a0b39608109`

See more details on using hashes here.

dbt-teradata 1.3.3.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

dbt-teradata

Installation

Sample profile

Python compatibility

dbt dependent packages version compatibility

Optional profile configurations

Logmech

Logdata

Stored Password Protection

Port

Retries

Other Teradata connection parameters

Supported Features

Materializations

Incremental Materialization

Commands

Custom configurations

General

Models

Table

Seeds

Grants

Cross DB macros

Compatibility

examples for cross DB macros

datediff

hash

last_day

Common Teradata-specific tasks

Support for dbt-utils package

Limitations

Transaction mode

Credits

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

Support for `dbt-utils` package