Environment diff tool for dbt

Recce


Recce is an environment diff tool for dbt projects. It helps you compare the results of two environments, such as development and production, and identify the differences.

Features

  1. Supports both a Web UI and a CLI.
  2. Provides multiple diff tools, including lineage diff, schema diff, and query diff, with more planned.
  3. Uses the dbt-core adapter framework to connect to your data warehouse. No additional configuration is required.

Use cases

  1. During development, you can verify new results by comparing them with production before pushing your changes.
  2. While reviewing a PR, you can see the extent of the changes and their impact before merging.
  3. For troubleshooting, you can run ad-hoc diff queries to pinpoint root causes.

Usage

Prerequisites

You need at least two environments in your dbt project, for example one for development and one for production. You can prepare two targets with separate schemas in your dbt profile. Here is an example profiles.yml:

jaffle_shop:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: jaffle_shop.duckdb
      schema: dev
    prod:
      type: duckdb
      path: jaffle_shop.duckdb
      schema: main

Getting Started

A 5-minute walkthrough using the jaffle shop example

  1. Installation

    pip install recce
    
  2. Go to your dbt project

    cd your-dbt-project/
    
  3. Prepare base artifacts: dbt generates artifacts on every invocation. You can find these files in the target/ folder.

    Artifact                   dbt command
    manifest.json              dbt run, dbt build, ..
    catalog.json (optional)    dbt docs generate

    Copy the artifacts for the base environment to the target-base/ folder.

  4. Run the recce server.

    recce server
    

    Recce diffs the two environments using the artifacts in target/ and target-base/.
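
Step 3 above (copying the base artifacts) can be scripted. The following is a minimal sketch, not part of Recce itself: the helper name copy_base_artifacts and its parameters are hypothetical, and it assumes the base environment's artifacts are currently sitting in target/.

```python
import shutil
from pathlib import Path

# Artifact names come from the table in step 3; catalog.json is optional.
REQUIRED = ("manifest.json",)
OPTIONAL = ("catalog.json",)


def copy_base_artifacts(target_dir="target", base_dir="target-base"):
    """Copy dbt artifacts from target_dir to base_dir; return the names copied.

    Hypothetical helper for illustration only, not a Recce API.
    """
    src, dst = Path(target_dir), Path(base_dir)

    # Fail early if a required artifact is missing, before creating anything.
    for name in REQUIRED:
        if not (src / name).is_file():
            raise FileNotFoundError(
                f"{src / name} not found; run dbt against the base environment first"
            )

    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in REQUIRED + OPTIONAL:
        if (src / name).is_file():
            shutil.copy2(src / name, dst / name)
            copied.append(name)
    return copied
```

One possible workflow, assuming the prod target from the example profiles.yml: run dbt run --target prod (and optionally dbt docs generate --target prod), call copy_base_artifacts(), then run dbt against your development target again so that target/ describes the current environment.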

Query Diff

You can run a query diff in both the Web UI and the CLI.

  • Web UI: Go to the Query tab and enter a query, for example:

    select * from {{ ref("mymodel") }}
    
  • CLI:

    recce diff --sql 'select * from {{ ref("mymodel") }}'
    

Primary key

In a query diff, the primary key columns uniquely identify each record, so that rows from the two sides can be matched against each other.

  • Web UI: In the query result, click the key icon in a column header to add that column to, or remove it from, the primary key list.

  • CLI: Use the --primary-keys option to specify the primary key columns. For a compound key, separate the columns with commas.

    recce diff --primary-keys event_id --sql 'select * from {{ ref("events") }} order by 1'
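
To illustrate why a primary key matters, here is a conceptual sketch of matching rows from two result sets by key. It is not Recce's actual implementation: the function diff_by_primary_key is hypothetical, rows are assumed to be plain dictionaries, and the status column is invented for the example (event_id is taken from the CLI example above).

```python
def diff_by_primary_key(base_rows, current_rows, primary_keys):
    """Classify rows as added, removed, or modified by matching on primary_keys.

    Hypothetical illustration only; rows are dicts keyed by column name.
    """
    def key_of(row):
        # A compound key is simply a tuple of several column values.
        return tuple(row[col] for col in primary_keys)

    base = {key_of(row): row for row in base_rows}
    current = {key_of(row): row for row in current_rows}

    return {
        "added": [current[k] for k in current if k not in base],
        "removed": [base[k] for k in base if k not in current],
        "modified": [
            (base[k], current[k])
            for k in base
            if k in current and base[k] != current[k]
        ],
    }


base = [{"event_id": 1, "status": "new"}, {"event_id": 2, "status": "new"}]
current = [{"event_id": 2, "status": "done"}, {"event_id": 3, "status": "new"}]
result = diff_by_primary_key(base, current, ["event_id"])
```

Without a key, the only option is to compare rows by position, so a single inserted row can make every following row look different. With a key, event 2 is reported as modified, event 1 as removed, and event 3 as added.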
    

Q&A

Q: How does recce connect to my data warehouse? Does recce support my data warehouse?

Recce uses the dbt adapter to connect to your warehouse, so it should work with the warehouse your dbt project already uses.

Q: Which credentials does recce use to connect to the two environments?

Recce uses a single target from the profile to connect to your warehouse. If you use the default target dev, recce uses that target's credentials to connect to both environments, so make sure those credentials can access both environments.


