DBT extension for Metaflow

This extension adds support for executing DBT models as part of steps in Metaflow Flows via decorators.

Basic usage

If your Flow project contains a dbt_project.yml, adding the @dbt decorator to a step executes dbt run as a pre-step, before that step's task runs.

@dbt
@step
def start(self):
    # DBT Models have been run when step execution starts

    self.next(self.second_step)

If you only want to run a specific model as part of a step, specify it with the models= attribute:

@dbt(models="customers")

Configuration options

Project directory

You might want to keep the DBT project in its own folder nested within the Flow project. Because of the way project lookup works, you then need to tell the decorator where the DBT project folder is. Pass the location as a relative or absolute path with the project_dir= attribute:

@dbt(models="customers", project_dir="./dbt_project")
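As a rough mental model, the two options above correspond to the `--select` and `--project-dir` flags of the dbt CLI. The sketch below builds an equivalent `dbt run` command line and checks that a folder actually contains a dbt_project.yml. The helper names `build_dbt_run_command` and `has_dbt_project` are hypothetical and are not part of this extension; how the decorator invokes dbt internally may differ.

```python
from pathlib import Path
from typing import List, Optional


def build_dbt_run_command(models: Optional[str] = None,
                          project_dir: Optional[str] = None) -> List[str]:
    """Hypothetical helper: build a `dbt run` argv list from decorator-style options.

    This only illustrates the dbt CLI flags that match the options;
    it is an assumption, not the extension's actual implementation.
    """
    cmd = ["dbt", "run"]
    if models:
        # --select restricts the run to the named model(s)
        cmd += ["--select", models]
    if project_dir:
        # --project-dir tells dbt where to look for dbt_project.yml
        cmd += ["--project-dir", project_dir]
    return cmd


def has_dbt_project(project_dir: str) -> bool:
    """Return True if project_dir contains a dbt_project.yml file."""
    return (Path(project_dir) / "dbt_project.yml").is_file()


if __name__ == "__main__":
    # Mirrors @dbt(models="customers", project_dir="./dbt_project")
    print(" ".join(build_dbt_run_command("customers", "./dbt_project")))
    print("project found:", has_dbt_project("./dbt_project"))
```

Running a check like `has_dbt_project` locally before deploying can catch a mistyped relative path early.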

Supplying credentials

When deploying a DBT flow to be executed remotely, we do not want to bundle sensitive credentials into the code package, so a profiles.yml with plain-text credentials will not suffice. Instead, we can use DBT's environment variable replacement (the env_var function) so that the profile only references variable names and the values are resolved at run time.

Example profiles.yml:

dbt_decorator:
  outputs:
    dev:
      type: postgres
      threads: 1
      host: localhost
      port: 5432
      user: "{{ env_var('DBT_POSTGRES_USER') }}"
      pass: "{{ env_var('DBT_POSTGRES_PW') }}"
      dbname: dbt_decorator
      schema: dev_jaffle_schema

    prod:
      type: postgres
      threads: 1
      host: localhost
      port: 5432
      user: "{{ env_var('DBT_POSTGRES_USER') }}"
      pass: "{{ env_var('DBT_POSTGRES_PW') }}"
      dbname: dbt_decorator
      schema: prod_jaffle_schema

  target: dev

Note: any profiles.yml in the flow project folder will be packaged, so make sure that it does not contain sensitive secrets.

We can supply the environment variables in various ways, for example:

  • having them already present in the execution environment
  • supplying them with the @environment decorator in the flow (this still ends up bundling secrets into the package, but is good for testing)
  • hydrating environment variables with the @secrets decorator from a secret manager.
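Whichever option is used, it can help to confirm that every variable referenced in the profile is actually set before DBT runs. The following is a minimal sketch of such a preflight check, using only the standard library. The function names `referenced_env_vars` and `missing_env_vars` are hypothetical and not part of this extension or of DBT, and the regex is a simplification: it does not account for the optional default value that env_var accepts as a second argument.

```python
import os
import re
from typing import List, Mapping, Optional

# Matches env_var('NAME') or env_var("NAME") as written in profiles.yml templates.
ENV_VAR_PATTERN = re.compile(r"env_var\(\s*['\"]([^'\"]+)['\"]")


def referenced_env_vars(profiles_text: str) -> List[str]:
    """Return the env var names referenced in profiles_text, deduplicated, in first-seen order."""
    seen: List[str] = []
    for name in ENV_VAR_PATTERN.findall(profiles_text):
        if name not in seen:
            seen.append(name)
    return seen


def missing_env_vars(profiles_text: str,
                     environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return referenced env var names that are not present in environ (defaults to os.environ)."""
    env = os.environ if environ is None else environ
    return [name for name in referenced_env_vars(profiles_text) if name not in env]


if __name__ == "__main__":
    # A fragment of the example profile above, inlined so the sketch is self-contained.
    sample = """
      user: "{{ env_var('DBT_POSTGRES_USER') }}"
      pass: "{{ env_var('DBT_POSTGRES_PW') }}"
    """
    missing = missing_env_vars(sample)
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All referenced environment variables are set.")
```

A check like this only reports variable names, never their values, so it is safe to leave in logs.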

Examples

Check out the example flows in the /examples folder for detailed usage.
