CLI tool for dbt users adopting analytics engineering best practices.

dbt-coves

What is dbt-coves?

dbt-coves is a complementary CLI tool for dbt that allows users to quickly apply Analytics Engineering best practices.

dbt-coves helps generate scaffolding for dbt by analyzing your data warehouse schema in Redshift, Snowflake, or BigQuery and creating the necessary configuration files (sql and yml).

⚠️ dbt-coves is in alpha. Do not use it on your production models until you have tested it yourself.

Supported dbt versions

Version            Status
<= 0.17.0          ❌ Not supported
0.18.x - 0.21.x    ✅ Tested
1.x                ✅ Tested

Supported adapters

Feature                    Snowflake    Redshift          BigQuery         Postgres
profiles.yml generation    ✅ Tested    🕥 In progress    ❌ Not tested    ❌ Not tested
sources generation         ✅ Tested    🕥 In progress    ❌ Not tested    ❌ Not tested

Installation

pip install dbt-coves

We recommend using Python virtualenvs and creating a separate environment for each project.

⚠️ If you have dbt < 0.18.0 installed, dbt-coves will automatically upgrade dbt to the latest version.
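
As a minimal sketch of the per-project environment recommendation, the standard-library venv module can create the environment from Python. The helper name create_project_env and the .venv folder name are illustrative assumptions, not something dbt-coves requires.

```python
import venv
from pathlib import Path


def create_project_env(project_dir, env_name=".venv", with_pip=True):
    """Create an isolated virtual environment inside project_dir.

    The helper and the default folder name are illustrative only.
    """
    env_path = Path(project_dir) / env_name
    # with_pip=True bootstraps pip inside the environment, so that
    # `pip install dbt-coves` can be run there afterwards.
    venv.EnvBuilder(with_pip=with_pip).create(env_path)
    return env_path
```

After activating the environment, run pip install dbt-coves inside it so that each project keeps its own dbt and dbt-coves versions.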

Main Features

Project initialization

dbt-coves init

Initializes a new ready-to-use dbt project that includes recommended integrations such as sqlfluff, pre-commit, dbt packages, among others.

Uses a cookiecutter template to make it easier to maintain.

Models generation

dbt-coves generate <resource>

Where <resource> can be sources or properties.

Code generation tool to easily generate models and model properties based on configuration and existing data.

Supports Jinja templates to adjust how the resources are generated.

Metadata

Supports the --metadata argument, which lets you specify a CSV file containing field types and descriptions to be inserted into the model property files.

dbt-coves generate sources --metadata metadata.csv

Metadata format:

database    schema    relation    column    key          type       description
raw         master    person      name      (empty)      varchar    The full name
raw         master    person      name      groupName    varchar    The group name
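
The two rows above could be written to a CSV file with a short script like the following. This is only a sketch: it assumes the header row uses exactly the column names shown in the table, and the reading of the key column in the comments is inferred from the example rows rather than confirmed by dbt-coves.

```python
import csv
import io

# Column names copied from the table above; that dbt-coves expects exactly
# this header row is an assumption of this sketch.
FIELDNAMES = ["database", "schema", "relation", "column", "key", "type", "description"]


def build_metadata_csv(rows):
    """Serialize metadata rows (dicts keyed by FIELDNAMES) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    # Empty key: presumably describes the column itself.
    {"database": "raw", "schema": "master", "relation": "person", "column": "name",
     "key": "", "type": "varchar", "description": "The full name"},
    # Non-empty key: presumably describes a field nested inside the column.
    {"database": "raw", "schema": "master", "relation": "person", "column": "name",
     "key": "groupName", "type": "varchar", "description": "The group name"},
]

metadata_csv = build_metadata_csv(rows)
```

Writing metadata_csv to metadata.csv gives a file that can be passed to the --metadata argument.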

Quality Assurance

dbt-coves check

Runs a set of checks in your local environment to ensure high code quality.

Checks can be extended by implementing pre-commit hooks.
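
As an illustration of such an extension, here is a sketch of a custom pre-commit hook written in Python. The check itself (every .sql model must have a sibling .yml properties file) and the function names are hypothetical examples, not checks that ship with dbt-coves. The sketch relies only on the general pre-commit contract: file names arrive as command-line arguments, and a non-zero exit status marks the hook as failed.

```python
import sys
from pathlib import Path


def models_missing_props(sql_paths):
    """Return the .sql files that have no sibling .yml file with the same stem."""
    missing = []
    for raw in sql_paths:
        path = Path(raw)
        if path.suffix == ".sql" and not path.with_suffix(".yml").exists():
            missing.append(str(path))
    return missing


def main(argv=None):
    # pre-commit passes the staged file names as command-line arguments.
    paths = sys.argv[1:] if argv is None else argv
    missing = models_missing_props(paths)
    for path in missing:
        print(f"missing properties file for {path}")
    # A non-zero return value, used as the exit status, fails the hook.
    return 1 if missing else 0
```

To use it as a hook, the script would end with sys.exit(main()) and be registered as a local hook in .pre-commit-config.yaml.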

Environment setup

Setting up your environment can be done in two different ways:

dbt-coves setup all

Runs a set of checks in your local environment and helps you configure every project component properly: SSH key, Git, dbt profiles.yml, VS Code extensions, sqlfluff and pre-commit.

You can also configure individual components:

dbt-coves setup git

Sets up the Git repository for the dbt-coves project.

dbt-coves setup dbt

Sets up dbt within the project (delegates to dbt init).

dbt-coves setup ssh

Sets up SSH keys for the dbt-coves project. Supports the --open_ssl_public_key argument, which generates an extra public key in OpenSSL format, useful for configuring certain providers (e.g. Snowflake authentication).

dbt-coves setup vscode

Sets up a predefined settings.json for VS Code. A custom settings.json may be added to the .dbt_coves/templates/ folder.

dbt-coves setup sqlfluff

Sets up sqlfluff for the dbt-coves project. Supports the --templates argument for using your custom .sqlfluff configuration file.

dbt-coves setup precommit

Sets up the default pre-commit template for the dbt-coves project. Supports the --templates argument for using your custom .pre-commit-config.yaml configuration file.

Extract configuration from Airbyte

dbt-coves extract airbyte

Extracts the configuration from your Airbyte sources, connections and destinations (excluding credentials) and stores it in the specified folder. The main goal of this feature is to keep track of configuration changes in your git repo and to roll back to a specific version when needed.

Load configuration to Airbyte

dbt-coves load airbyte

Loads the Airbyte configuration generated with dbt-coves extract airbyte onto an Airbyte server. The secrets folder needs to be specified separately. You can use git-secret to encrypt the secrets and make them part of your git repo.

Settings

dbt-coves can optionally read settings from .dbt_coves.yml or .dbt_coves/config.yml. A standard settings file could look like this:

generate:
  sources:
    schemas:
      - RAW
    destination: "models/sources/{{ schema }}/{{ relation }}.sql"
    model_props_strategy: one_file_per_model
    templates_folder: ".dbt_coves/templates"

In this example options for the generate command are provided:

schemas: List of schema names where to look for source tables

destination: Path to generated model, where schema represents the lowercased schema and relation the lowercased table name.

model_props_strategy: Defines how dbt-coves generates model properties files. Currently only one_file_per_model is available, which creates one yaml file per model.

templates_folder: Folder where source generation jinja templates are located.
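
To make the destination setting concrete, the sketch below shows how the example path would resolve for a table PERSON in schema RAW. A small regular expression stands in for the template rendering that dbt-coves presumably performs with Jinja, and the helper name resolve_destination is hypothetical.

```python
import re


def resolve_destination(template, schema, relation):
    """Fill {{ schema }} and {{ relation }} placeholders with lowercased names.

    A regex substitution is used here as a stand-in for real template rendering.
    """
    values = {"schema": schema.lower(), "relation": relation.lower()}
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        # Any placeholder other than schema or relation raises a KeyError.
        lambda match: values[match.group(1)],
        template,
    )


path = resolve_destination(
    "models/sources/{{ schema }}/{{ relation }}.sql", "RAW", "PERSON"
)
print(path)  # models/sources/raw/person.sql
```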

Override source generation templates

To customize the generated models and model properties, place files like the following in the templates_folder folder:

source_model.sql

with raw_source as (

    select
        *
    from {% raw %}{{{% endraw %} source('{{ relation.schema.lower() }}', '{{ relation.name.lower() }}') {% raw %}}}{% endraw %}

),

final as (

    select
{%- if adapter_name == 'SnowflakeAdapter' %}
{%- for key, cols in nested.items() %}
  {%- for col in cols %}
        {{ key }}:{{ '"' + col + '"' }}::{{ cols[col]["type"] }} as {{ cols[col]["id"] }}{% if not loop.last or columns %},{% endif %}
  {%- endfor %}
{%- endfor %}
{%- elif adapter_name == 'BigQueryAdapter' %}
{%- for key, cols in nested.items() %}
  {%- for col in cols %}
        cast({{ key }}.{{ col }} as {{ cols[col]["type"].replace("varchar", "string") }}) as {{ cols[col]["id"] }}{% if not loop.last or columns %},{% endif %}
  {%- endfor %}
{%- endfor %}
{%- elif adapter_name == 'RedshiftAdapter' %}
{%- for key, cols in nested.items() %}
  {%- for col in cols %}
        {{ key }}.{{ col }}::{{ cols[col]["type"] }} as {{ cols[col]["id"] }}{% if not loop.last or columns %},{% endif %}
  {%- endfor %}
{%- endfor %}
{%- endif %}
{%- for col in columns %}
        {{ '"' + col['name'] + '"' }} as {{ col['id'] }}{% if not loop.last %},{% endif %}
{%- endfor %}

    from raw_source

)

select * from final
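
To show what the Snowflake branch of this template produces, the following sketch rebuilds its select list in plain Python. The shapes of nested (column name mapped to nested keys, each with a type and an id) and columns (dicts with name and id) are inferred from the template itself, and the helper name is hypothetical. The sketch joins all expressions with commas instead of reproducing the template's loop.last logic.

```python
def snowflake_select_lines(nested, columns):
    """Build the select-list expressions in the style of the Snowflake branch."""
    lines = []
    # Nested fields: <column>:"<key>"::<type> as <id>
    for key, cols in nested.items():
        for col, meta in cols.items():
            lines.append(f'{key}:"{col}"::{meta["type"]} as {meta["id"]}')
    # Flat columns: "<NAME>" as <id>
    for col in columns:
        lines.append(f'"{col["name"]}" as {col["id"]}')
    return ",\n".join(lines)


select_list = snowflake_select_lines(
    nested={"name": {"groupName": {"type": "varchar", "id": "group_name"}}},
    columns=[{"name": "ID", "id": "id"}],
)
print(select_list)
# name:"groupName"::varchar as group_name,
# "ID" as id
```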

source_model_props.yml

version: 2

sources:
  - name: {{ relation.schema.lower() }}
{%- if source_database %}
    database: {{ source_database }}
{%- endif %}
    schema: {{ relation.schema.lower() }}
    tables:
      - name: {{ relation.name.lower() }}
        identifier: {{ relation.name }}

models:
  - name: {{ model.lower() }}
    columns:
{%- for cols in nested.values() %}
  {%- for col in cols %}
      - name: {{ cols[col]["id"] }}
      {%- if cols[col]["description"] %}
        description: "{{ cols[col]['description'] }}"
      {%- endif %}
  {%- endfor %}
{%- endfor %}
{%- for col in columns %}
      - name: {{ col['id'] }}
      {%- if col['description'] %}
        description: "{{ col['description'] }}"
      {%- endif %}
{%- endfor %}

model_props.yml

version: 2

models:
  - name: {{ model.lower() }}
    columns:
{%- for col in columns %}
      - name: {{ col['id'] }}
      {%- if col['description'] %}
        description: "{{ col['description'] }}"
      {%- endif %}
{%- endfor %}

model.yml

version: 2

models:
  - name: {{ model.lower() }}
    columns:
{%- for cols in nested.values() %}
  {%- for col in cols %}
      - name: {{ cols[col]["id"] }}
      {%- if cols[col]["description"] %}
        description: "{{ cols[col]['description'] }}"
      {%- endif %}
  {%- endfor %}
{%- endfor %}
{%- for col in columns %}
      - name: {{ col.name.lower() }}
{%- endfor %}

Thanks

The project's main structure was inspired by dbt-sugar. Special thanks to Bastien Boutonnet for the great work done.

About

Learn more about Datacoves.

CLI Reference

For complete usage details, run:

dbt-coves -h
dbt-coves <command> -h
