dbt-athena

The Athena adapter plugin for dbt (data build tool)

  • Supports dbt version 1.3.*
  • Supports Seeds
  • Correctly detects views and their columns
  • Supports incremental models
    • On Iceberg tables:
      • Supports the use of unique_key only with query engine v3, using the merge strategy
      • Supports the append strategy
    • On Hive tables:
      • Supports two incremental update strategies: insert_overwrite and append
      • Does not support the use of unique_key
  • Does not support Python models

Installation

  • pip install dbt-athena-community
  • Or pip install git+https://github.com/dbt-athena/dbt-athena.git

Prerequisites

To start, you will need an S3 bucket, for instance my-staging-bucket, and an Athena database:

CREATE DATABASE IF NOT EXISTS analytics_dev
COMMENT 'Analytics models generated by dbt (development)'
LOCATION 's3://my-staging-bucket/'
WITH DBPROPERTIES ('creator'='Foo Bar', 'email'='foo@bar.com');

Notes:

  • Take note of your AWS region code (e.g. us-west-2 or eu-west-2, etc.).
  • You can also use AWS Glue to create and manage Athena databases.

Credentials

This plugin does not accept any credentials directly. Instead, credentials are determined automatically based on AWS CLI/boto3 conventions and stored login info. You can configure the AWS profile name to use via aws_profile_name. Check out the dbt profile configuration below for details.
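
For example, credentials can come from a named profile in your AWS shared credentials file (~/.aws/credentials); the profile name and placeholder values below are illustrative:

[my-profile]
aws_access_key_id = <YOUR_ACCESS_KEY_ID>
aws_secret_access_key = <YOUR_SECRET_ACCESS_KEY>

With aws_profile_name: my-profile in your dbt profile, boto3 resolves these credentials automatically.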

Configuring your profile

A dbt profile can be configured to run against AWS Athena using the following configuration:

  • s3_staging_dir (required): S3 location to store Athena query results and metadata. Example: s3://bucket/dbt/
  • s3_data_dir (optional): Prefix for storing tables, if different from the connection's s3_staging_dir. Example: s3://bucket2/dbt/
  • s3_data_naming (optional): How to generate table paths in s3_data_dir. Example: schema_table_unique
  • region_name (required): AWS region of your Athena instance. Example: eu-west-1
  • schema (required): Specify the schema (Athena database) to build models into (lowercase only). Example: dbt
  • database (required): Specify the database (Data catalog) to build models into (lowercase only). Example: awsdatacatalog
  • poll_interval (optional): Interval in seconds to use for polling the status of query results in Athena. Example: 5
  • aws_profile_name (optional): Profile to use from your AWS shared credentials file. Example: my-profile
  • work_group (optional): Identifier of the Athena workgroup. Example: my-custom-workgroup
  • num_retries (optional): Number of times to retry a failing query. Example: 3

Example profiles.yml entry:

athena:
  target: dev
  outputs:
    dev:
      type: athena
      s3_staging_dir: s3://athena-query-results/dbt/
      s3_data_dir: s3://your_s3_bucket/dbt/
      s3_data_naming: schema_table
      region_name: eu-west-1
      schema: dbt
      database: awsdatacatalog
      aws_profile_name: my-profile
      work_group: my-workgroup

Additional information

  • threads is supported
  • database and catalog can be used interchangeably

Usage notes

Models

Table Configuration

  • external_location (default=none)
    • If set, the full S3 path in which the table will be saved.
  • partitioned_by (default=none)
    • A list of columns by which the table will be partitioned
    • Limited to creation of 100 partitions (currently)
  • bucketed_by (default=none)
    • A list of columns to bucket data by
  • bucket_count (default=none)
    • The number of buckets for bucketing your data
  • format (default='parquet')
    • The data format for the table
    • Supports ORC, PARQUET, AVRO, JSON, or TEXTFILE
  • write_compression (default=none)
    • The compression type to use for any storage format that allows compression to be specified. To see which options are available, check out the Athena CREATE TABLE AS documentation
  • field_delimiter (default=none)
    • Custom field delimiter, for when format is set to TEXTFILE
  • table_properties
    • Table properties to add to the table; valid for Iceberg only
  • strict_location (default=True)
    • When working with Iceberg it is possible to rename tables; to do so, tables must not share the same location. Setting strict_location to false allows the table to be created in a unique location, which makes renaming possible.
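
As a sketch of how several of these options combine, a Hive-format table model might be configured like this (the S3 path and column values are illustrative):

{{ config(
    materialized='table',
    format='parquet',
    write_compression='snappy',
    partitioned_by=['created_date'],
    external_location='s3://my-staging-bucket/curated/my_model/'
) }}

SELECT
    1 AS order_id,
    19.99 AS amount,
    current_date AS created_date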

Table location

The location in which a table is saved is determined by:

  1. If external_location is defined, that value is used.
  2. If s3_data_dir is defined, the path is determined by that value and s3_data_naming.
  3. If s3_data_dir is not defined, data is stored under s3_staging_dir/tables/.

Here are all the options available for s3_data_naming:

  • uuid: {s3_data_dir}/{uuid4()}/
  • table: {s3_data_dir}/{table}/
  • table_unique: {s3_data_dir}/{table}/{uuid4()}/
  • schema_table: {s3_data_dir}/{schema}/{table}/
  • schema_table_unique: {s3_data_dir}/{schema}/{table}/{uuid4()}/
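
For example, with s3_data_dir set to s3://bucket2/dbt/, a schema named dbt, and a model named orders, schema_table_unique would yield a location like s3://bucket2/dbt/dbt/orders/<uuid>/ (the names here are illustrative).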

You can set s3_data_naming globally in the target profile, override it per model in the table config, or set it for groups of models in dbt_project.yml, as in the sketch below.
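
For instance, a dbt_project.yml entry that overrides the naming for one folder of models might look like this (the project and folder names are illustrative):

models:
  my_dbt_project:
    staging:
      +s3_data_naming: table_unique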

Incremental models

The adapter supports incremental models.

These strategies are supported:

  • insert_overwrite
  • append

Note: unique_key is not supported with these strategies; on Iceberg tables, unique_key can be used with the merge strategy on query engine v3, as noted above.

on_schema_change is an option to reflect schema changes in incremental models. The following options are supported:

  • ignore (default)
  • fail
  • append_new_columns
  • sync_all_columns

For details, please refer to the dbt docs.
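
As a sketch, an incremental model combining a strategy with on_schema_change might look like the following (the model reference and column names are illustrative):

{{ config(
    materialized='incremental',
    incremental_strategy='insert_overwrite',
    partitioned_by=['created_date'],
    on_schema_change='append_new_columns'
) }}

SELECT
    event_id,
    user_id,
    created_date
FROM {{ ref('stg_events') }}
{% if is_incremental() %}
WHERE created_date > (SELECT max(created_date) FROM {{ this }})
{% endif %}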

Iceberg

The adapter supports the table materialization for Iceberg.

To get started, just add this as your model:

{{ config(
    materialized='table',
    format='iceberg',
    partitioned_by=['bucket(5, user_id)'],
    table_properties={
        'optimize_rewrite_delete_file_threshold': '2'
    }
) }}

SELECT
    'A' AS user_id,
    'pi' AS name,
    'active' AS status,
    17.89 AS cost,
    1 AS quantity,
    100000000 AS quantity_big,
    current_date AS my_date

Iceberg supports bucketing as hidden partitions; therefore, use the partitioned_by config to add specific bucketing conditions.

Unsupported functionality

Due to the nature of AWS Athena, not all core dbt functionality is supported. The following features of dbt are not implemented on Athena:

  • Snapshots

Known issues

  • Quoting is not currently supported

    • If you need to quote your sources, escape the quote characters in your source definitions:
    version: 2
    
    sources:
      - name: my_source
        tables:
          - name: first_table
            identifier: "first table"       # Not like that
          - name: second_table
            identifier: "\"second table\""  # Like this
    
  • Table, schema, and database names should only be lowercase

Contributing

This connector works with Python versions 3.7 through 3.10.

Getting started

To start developing on this adapter, clone the repo and run this make command (see the Makefile):

make setup

It will:

  1. Install all dependencies.
  2. Install pre-commit hooks.
  3. Generate your .env file.

Next, adjust the .env file by configuring the environment variables to match your Athena development environment.

Running tests

You must have an AWS account with Athena set up in order to run the tests. You can run the tests using make:

make run_tests
