Skip to main content

The TimescaleDB adapter plugin for dbt

Project description

dbt logo timescale logo

pdm-managed PyPI - Version PyPI - License tests Coverage Status

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

dbt is the T in ELT. Organize, cleanse, denormalize, filter, rename, and pre-aggregate the raw data in your warehouse so that it's ready for analysis.

TimescaleDB

Timescale extends PostgreSQL for all of your resource-intensive production workloads, so you can build faster, scale further, and stay under budget.

dbt-timescaledb features & documentation

Installation

Install the package using pip:

pip install dbt-timescaledb

In your profiles.yml, use the same configuration as you'd do for a regular PostgreSQL database. The only difference is that you need to set the type to timescaledb.

company-name:
  target: dev
  outputs:
    dev:
      type: timescaledb # only option different from regular dbt-postgres
      host: [hostname]
      user: [username]
      password: [password]
      port: [port]
      dbname: [database name]
      schema: [dbt schema]
      # see dbt-postgres docs linked above for more options

Hypertables

You can materialize your models as hypertables. The hypertable materialization requires you to set the time_column_name configuration option in your models. This will create a hypertable in TimescaleDB.

{{
  config(
    materialized='hypertable',
    time_column_name='time_column'
  )
}}
select current_timestamp as time_column

As with any dbt model configuration, you can also set this in YAML (docs):

models:
  your_project_name:
    folder_containing_the_hypertables:
      +materialized: hypertable
        model_one:
          +time_column_name: time_column
        model_two:
          +time_column_name: time_column_name_in_model_two
# ...

All other TimescaleDB hypertable configuration options are supported through model configuration as well:

  • time_column_name
  • partitioning_column
  • number_partitions
  • chunk_time_interval
  • create_default_indexes (boolean)
  • partitioning_func
  • associated_schema_name
  • associated_table_prefix
  • time_partitioning_func
  • replication_factor
  • data_nodes (list of strings)
  • distributed (boolean)

Hypertable compression

You can also configure hypertable compression options:

{{
  config(
    materialized='hypertable',
    time_column_name='time_column',
    compression=True
  )
}}
select current_timestamp as time_column

or in YAML:

models:
  your_project_name:
    folder_containing_the_hypertables:
      +materialized: hypertable
# ...

or

{{
  config(
    materialized='hypertable',
    time_column_name='time_column',
    compression={
      chunk_time_interval='1 day',
    }
  )
}}
select current_timestamp as time_column

or YAML examples:

models:
  your_project_name:
    folder_containing_the_hypertables:
      +materialized: hypertable
      +compression: false # same as leaving this out, default value
        model_one:
          +time_column_name: time_column
          +compression: true
        model_two:
          +time_column_name: time_column_name_in_model_two
          +compression:
            chunk_time_interval: '1 day'
            orderby: 'another_column'
            segmentby: ['column_one', 'column_two']
# ...

The following compression options are supported:

  • orderby (string)
  • segmentby (list of strings)
  • chunk_time_interval (the actual interval, not prefixed with "interval")

Continuous aggregates

There is support for continuous_aggregate materialization. This materialization will create a continuous aggregate in TimescaleDB.

{{
  config(
    materialized='continuous_aggregate',
  )
}}
select
    count(*),
    time_bucket(interval '1 day', time_column) as bucket
from {{ ref('a_hypertable') }}
group by 2

More

Feel free to request things you're interested in by creating an issue.

The following things are planned:

  • Basic hypertable support
  • Basic continuous aggregate support
  • Configure continuous aggregate policies through dbt

License

MIT License. See LICENSE for full details.

Code of Conduct

Everyone interacting in the dbt project's codebases, issue trackers, chat rooms, and mailing lists is expected to follow the dbt Code of Conduct.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dbt_timescaledb-1.7.0a3.tar.gz (10.4 kB view hashes)

Uploaded Source

Built Distribution

dbt_timescaledb-1.7.0a3-py3-none-any.whl (10.5 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page