recce

recce is an environment diff tool for dbt.

Features

  1. Supports the same adapter framework as dbt.
  2. Provides both a Web UI and a CLI.
  3. Shows the lineage diff between environments.

Use cases

  1. When developing, you can check new results by comparing them against production.
  2. When reviewing a PR, you can understand the impact of the change.
  3. When troubleshooting, you can run ad hoc diff queries to find the root cause.

Usage

Prerequisites

You need at least two environments in your dbt project, for example one for development and another for production. You can prepare two targets with separate schemas in your dbt profile. Here is an example profiles.yml:

jaffle_shop:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: jaffle_shop.duckdb
      schema: dev
    prod:
      type: duckdb
      path: jaffle_shop.duckdb
      schema: main

Getting Started

  1. Installation

    git clone git@github.com:InfuseAI/recce.git
    cd recce
    pip install -e .
    
  2. Put the manifest.json of production (or of any environment you would like to diff against) in the target-base/ folder. manifest.json is one of the artifacts dbt generates each time a command runs. By default, you can find it in the target/ folder.

  3. Develop your awesome features

    dbt run
    
  4. Run the recce command

    recce server
    
  5. Review the lineage diff.

  6. Switch to the Query tab, then write and run a query

    select * from {{ ref('mymodel') }}
    

    Here, ref is a Jinja macro that references a model by name.
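
To illustrate why one query can be run against both environments, here is a minimal Python sketch of how a ref call might resolve to a different relation per manifest. It is not recce's implementation: the resolve_refs helper, the regex-based substitution (used here instead of real Jinja rendering) and the trimmed sample manifests are assumptions for illustration. It relies only on the nodes entries of a dbt manifest.json and their name, schema, alias and resource_type fields.

```python
import re

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def resolve_refs(sql: str, manifest: dict) -> str:
    """Replace each ref() call with the schema-qualified relation from a manifest."""
    # Index model nodes by model name.
    models = {
        node["name"]: node
        for node in manifest["nodes"].values()
        if node.get("resource_type") == "model"
    }

    def substitute(match: re.Match) -> str:
        node = models[match.group(1)]  # raises KeyError if the model is unknown
        return f'{node["schema"]}.{node.get("alias") or node["name"]}'

    return REF_PATTERN.sub(substitute, sql)


# Hypothetical, heavily trimmed manifests for the two environments.
current = {"nodes": {"model.jaffle_shop.mymodel": {
    "resource_type": "model", "name": "mymodel", "schema": "dev", "alias": "mymodel"}}}
base = {"nodes": {"model.jaffle_shop.mymodel": {
    "resource_type": "model", "name": "mymodel", "schema": "main", "alias": "mymodel"}}}

query = "select * from {{ ref('mymodel') }}"
current_sql = resolve_refs(query, current)
base_sql = resolve_refs(query, base)
print(current_sql)  # select * from dev.mymodel
print(base_sql)     # select * from main.mymodel
```

The same query text yields one statement per environment, which is what makes a side-by-side comparison possible.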

Under the hood, recce uses the manifest.json files under target/ and target-base/ to generate and execute the queries.
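
As a rough picture of what comparing two manifests can look like, the Python sketch below classifies nodes as added, removed or modified. This is an assumption about one possible approach, not a description of recce's actual logic: the load_nodes and diff_nodes helpers are hypothetical, and the comparison uses only the nodes mapping and the per-node checksum field found in a dbt manifest.json. The sample node data is made up.

```python
import json
from pathlib import Path


def load_nodes(manifest_path: str) -> dict:
    """Return the `nodes` mapping (unique_id -> node) of a dbt manifest.json."""
    return json.loads(Path(manifest_path).read_text())["nodes"]


def diff_nodes(base_nodes: dict, current_nodes: dict) -> dict:
    """Classify nodes as added, removed, or modified between two manifests."""
    base_ids, current_ids = set(base_nodes), set(current_nodes)

    def checksum(node: dict):
        # dbt stores a content checksum per node; None if it is missing.
        return node.get("checksum", {}).get("checksum")

    modified = sorted(
        uid for uid in base_ids & current_ids
        if checksum(base_nodes[uid]) != checksum(current_nodes[uid])
    )
    return {
        "added": sorted(current_ids - base_ids),
        "removed": sorted(base_ids - current_ids),
        "modified": modified,
    }


# Hypothetical in-memory stand-ins for the two `nodes` mappings.
base = {
    "model.jaffle_shop.orders": {"checksum": {"name": "sha256", "checksum": "aaa"}},
    "model.jaffle_shop.customers": {"checksum": {"name": "sha256", "checksum": "bbb"}},
}
current = {
    "model.jaffle_shop.orders": {"checksum": {"name": "sha256", "checksum": "ccc"}},
    "model.jaffle_shop.payments": {"checksum": {"name": "sha256", "checksum": "ddd"}},
}
result = diff_nodes(base, current)
print(result)
```

With real files, the call would be diff_nodes(load_nodes("target-base/manifest.json"), load_nodes("target/manifest.json")).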

Run Query Diff

You can run a query diff either in the Web UI

recce server

or in the CLI

recce diff --sql "select * from {{ ref('mymodel') }}"
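
Quotes around the model name inside a quoted SQL argument are easy to get wrong in a shell. As a hedged illustration, the Python sketch below builds the same recce diff --sql invocation as an argument list, so no shell quoting is involved, and also prints a form that is safe to paste into a POSIX shell. The build_diff_command helper is hypothetical; only the recce diff --sql command itself comes from this README.

```python
import shlex


def build_diff_command(sql: str) -> list[str]:
    """Return the argument list for `recce diff --sql <sql>`."""
    return ["recce", "diff", "--sql", sql]


argv = build_diff_command("select * from {{ ref('mymodel') }}")

# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
shell_line = shlex.join(argv)
print(shell_line)

# To actually run it (requires recce to be installed), pass the list directly:
# import subprocess
# subprocess.run(argv, check=True)
```

Passing the list to subprocess.run keeps the SQL intact, including the single quotes inside ref().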

Specify the primary key columns

In a query diff, the primary key columns are used to identify the same record on both sides.

There are two ways to specify the primary key:

  1. Define it in the SQL: add the config macro to your SQL.

    {{
       config( primary_key=['DATE_WEEK', 'COUNTRY'])
    }}
    
    select ...
    
  2. Select it in the query result: in the Web UI, click the key icon in a column header to toggle whether that column is part of the primary key.
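
To make the role of the primary key concrete, here is a minimal Python sketch of matching rows from two result sets on their key columns. It is an illustration under stated assumptions, not recce's diff algorithm: the diff_by_primary_key helper, the REVENUE column and the sample rows are all hypothetical, and keys are assumed to be unique within each result set. Only the DATE_WEEK and COUNTRY key columns come from the example above.

```python
def diff_by_primary_key(base_rows, current_rows, primary_key):
    """Match rows on the primary key columns and classify the differences."""

    def index(rows):
        # Map the tuple of key values to the full row.
        return {tuple(row[col] for col in primary_key): row for row in rows}

    base, current = index(base_rows), index(current_rows)
    return {
        "added": [current[k] for k in current if k not in base],
        "removed": [base[k] for k in base if k not in current],
        "changed": [
            (base[k], current[k])
            for k in base
            if k in current and base[k] != current[k]
        ],
    }


# Hypothetical result sets from the base and current environments.
base_rows = [
    {"DATE_WEEK": "2023-11-06", "COUNTRY": "US", "REVENUE": 100},
    {"DATE_WEEK": "2023-11-06", "COUNTRY": "TW", "REVENUE": 40},
]
current_rows = [
    {"DATE_WEEK": "2023-11-06", "COUNTRY": "US", "REVENUE": 120},
    {"DATE_WEEK": "2023-11-13", "COUNTRY": "US", "REVENUE": 90},
]

result = diff_by_primary_key(base_rows, current_rows, ["DATE_WEEK", "COUNTRY"])
print(result)
```

Without a key, the two US rows could not be paired, and the change in REVENUE would show up as one removed row and one added row instead of a single changed record.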

Q&A

Q: How does recce connect to my data warehouse? Does recce support my data warehouse?

recce uses the dbt adapter to connect to your warehouse, so it should work with any warehouse that your dbt adapter supports.

Q: What credentials does recce use to connect to the two environments?

recce uses the same target in the profile to connect to your warehouse. If you use the default target dev, recce uses that target's credentials to connect to both environments, so make sure those credentials can access both environments.
