Command-line tool and Python library to efficiently diff rows across two different databases.

Datafold

data-diff

Develop dbt models faster by testing as you code.

See how every change to dbt code affects the data produced in the modified model and downstream.


What is data-diff?

data-diff is an open-source package that shows the impact of your dbt code changes on your dbt models as you code.


👀 Watch the 4-min demo video

Getting Started

Install data-diff

Install data-diff with the command that is specific to the database you use with dbt.

Snowflake

pip install data-diff 'data-diff[snowflake,dbt]' -U

BigQuery

pip install data-diff 'data-diff[dbt]' google-cloud-bigquery -U

Redshift

pip install data-diff 'data-diff[redshift,dbt]' -U

Postgres

pip install data-diff 'data-diff[postgres,dbt]' -U

Databricks

pip install data-diff 'data-diff[databricks,dbt]' -U

DuckDB

pip install data-diff 'data-diff[duckdb,dbt]' -U
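
The install commands above follow one pattern per database, with BigQuery as the exception (it uses only the dbt extra plus a separate package). As a small sketch, the Python helper below rebuilds those command lines from the list above. The names `EXTRAS` and `install_command` are hypothetical and are not part of data-diff.

```python
import shlex

# Extras per database, copied from the install commands listed above.
# BigQuery uses only the dbt extra and adds google-cloud-bigquery separately.
EXTRAS = {
    "snowflake": "snowflake,dbt",
    "bigquery": "dbt",
    "redshift": "redshift,dbt",
    "postgres": "postgres,dbt",
    "databricks": "databricks,dbt",
    "duckdb": "duckdb,dbt",
}


def install_command(database: str) -> str:
    """Return the pip command listed above for the given database."""
    key = database.lower()
    if key not in EXTRAS:
        raise ValueError(f"no install command listed for {database!r}")
    args = ["pip", "install", "data-diff", f"data-diff[{EXTRAS[key]}]"]
    if key == "bigquery":
        args.append("google-cloud-bigquery")
    args.append("-U")
    # shlex.join quotes the bracketed extras so the shell does not glob them.
    return shlex.join(args)


print(install_command("postgres"))
```

The quoting matters in shells such as zsh, where unquoted square brackets are treated as a glob pattern.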

Update a few lines in your dbt_project.yml.

#dbt_project.yml
vars:
  data_diff:
    prod_database: my_database
    prod_schema: my_default_schema

Run your first data diff!

dbt run && data-diff --dbt
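
If you would rather trigger the same two steps from a Python script (for example, inside a task runner), a minimal sketch might look like the following. It only shells out to the two commands shown above and stops at the first failure, as `&&` does. The names `STEPS` and `run_first_diff` are hypothetical, and the sketch assumes `dbt` and `data-diff` are on your PATH and that you run it from your dbt project directory.

```python
import subprocess
from typing import Callable, List, Sequence

# The two steps from the command above, in order.
STEPS: List[List[str]] = [["dbt", "run"], ["data-diff", "--dbt"]]


def run_first_diff(runner: Callable[[Sequence[str]], int] = subprocess.call) -> bool:
    """Run each step in order; stop at the first non-zero exit code, like `&&`."""
    for step in STEPS:
        if runner(step) != 0:
            return False
    return True
```

Calling `run_first_diff()` with no arguments runs the real commands. The `runner` parameter exists only so the sequencing can be exercised without dbt installed.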

We recommend getting started by walking through our setup instructions, which contain examples and details.

Please reach out on the dbt Slack in #tools-datafold if you have any trouble whatsoever getting started!



Diffing between databases

Check out our documentation if you're looking to compare data across databases (for example, between Postgres and Snowflake).
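
To make the idea of a cross-database row diff concrete, here is a conceptual sketch using two in-memory SQLite databases from the Python standard library. It is not data-diff's API or algorithm: the functions `fetch_rows`, `diff_rows`, and `make_db` and the `users` table are hypothetical, and the sketch loads both tables fully into memory, which is only reasonable for tiny tables. It assumes the first column is the primary key.

```python
import sqlite3


def fetch_rows(conn, table):
    """Return {primary key: full row}, assuming the first column is the key."""
    # The table name is interpolated directly: fine for a sketch,
    # not for untrusted input.
    return {row[0]: row for row in conn.execute(f"SELECT * FROM {table}")}


def diff_rows(conn_a, conn_b, table):
    """Yield ('-', row) for rows missing or changed in B, ('+', row) for rows new or changed in B."""
    rows_a = fetch_rows(conn_a, table)
    rows_b = fetch_rows(conn_b, table)
    for key in sorted(rows_a.keys() | rows_b.keys()):
        a, b = rows_a.get(key), rows_b.get(key)
        if a == b:
            continue  # identical in both databases
        if a is not None:
            yield ("-", a)
        if b is not None:
            yield ("+", b)


def make_db(rows):
    """Create an in-memory database with a small example table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    return conn


source = make_db([(1, "ada"), (2, "grace"), (3, "alan")])
target = make_db([(1, "ada"), (2, "Grace"), (4, "edsger")])

for sign, row in diff_rows(source, target, "users"):
    print(sign, row)
```

A changed row appears as a `-`/`+` pair with the same key, while a row present on only one side appears once.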


Contributors

We thank everyone who contributed so far!


License

This project is licensed under the terms of the MIT License.
