Command-line tool and Python library to efficiently diff rows across two different databases.

data-diff

Develop dbt models faster by testing as you code.

See how every change to dbt code affects the data produced in the modified model and downstream.


What is data-diff?

data-diff is an open source package that you can use to see the impact of your dbt code changes on your dbt models as you code.


Watch the 4-minute demo video.

Getting Started

Install data-diff

Install data-diff with the command that is specific to the database you use with dbt.

Snowflake

pip install data-diff 'data-diff[snowflake,dbt]' -U

BigQuery

pip install data-diff 'data-diff[dbt]' google-cloud-bigquery -U

Redshift

pip install data-diff 'data-diff[redshift,dbt]' -U

Postgres

pip install data-diff 'data-diff[postgres,dbt]' -U

Databricks

pip install data-diff 'data-diff[databricks,dbt]' -U

DuckDB

pip install data-diff 'data-diff[duckdb,dbt]' -U

Update a few lines in your dbt_project.yml.

#dbt_project.yml
vars:
  data_diff:
    prod_database: my_database
    prod_schema: my_default_schema
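
Before running a diff, it can help to confirm that both variables are actually set. The following is a minimal sketch, not part of data-diff: the function name `missing_data_diff_vars` is hypothetical, and it assumes you have already parsed dbt_project.yml into a Python dict with the YAML parser of your choice.

```python
from typing import List

# The two keys shown in the dbt_project.yml snippet above.
REQUIRED_VARS = ("prod_database", "prod_schema")


def missing_data_diff_vars(project: dict) -> List[str]:
    """Return the required data_diff vars that are absent or empty.

    `project` is assumed to be the already-parsed contents of
    dbt_project.yml (hypothetical helper, not a data-diff API).
    """
    vars_section = project.get("vars") or {}
    data_diff_vars = vars_section.get("data_diff") or {}
    return [name for name in REQUIRED_VARS if not data_diff_vars.get(name)]


# Parsed form of the example configuration.
project = {
    "vars": {
        "data_diff": {
            "prod_database": "my_database",
            "prod_schema": "my_default_schema",
        }
    }
}
print(missing_data_diff_vars(project))  # prints []
```

An empty list means both values are present; any names returned are the ones still to be filled in.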

Run your first data diff!

dbt run && data-diff --dbt
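
If you would rather drive the same two commands from a Python script (for example, in a local task runner), a sketch along these lines may work. The helper `run_in_order` is hypothetical; it only mirrors the `&&` behavior of the shell line above by stopping at the first command that exits with a non-zero code.

```python
import subprocess
from typing import Callable, List, Sequence

# The same two commands as the shell one-liner, in order.
COMMANDS: List[List[str]] = [["dbt", "run"], ["data-diff", "--dbt"]]


def run_in_order(
    commands: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int] = subprocess.call,
) -> int:
    """Run each command in turn; stop at the first non-zero exit code.

    `runner` defaults to subprocess.call, which returns the exit code.
    It is a parameter only so the logic can be exercised without
    actually invoking dbt or data-diff.
    """
    for command in commands:
        exit_code = runner(command)
        if exit_code != 0:
            return exit_code
    return 0


# Usage, assuming dbt and data-diff are installed and on your PATH:
# exit_code = run_in_order(COMMANDS)
```

As with `&&`, data-diff is only invoked when `dbt run` succeeds, so a failed build never produces a misleading diff.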

We recommend getting started by walking through our simple setup instructions, which contain examples and details.

Please reach out on the dbt Slack in #tools-datafold if you have any trouble whatsoever getting started!



Diffing between databases

Check out our documentation if you're looking to compare data across databases (for example, between Postgres and Snowflake).
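
To make the idea of a row-level diff between two databases concrete, here is a deliberately naive sketch using two in-memory SQLite databases from Python's standard library. It is not how data-diff works internally and does not use the data-diff API: the helpers `diff_rows` and `make_db` are hypothetical, and the approach loads both tables fully into memory, which is only reasonable for tiny tables.

```python
import sqlite3


def diff_rows(conn_a, conn_b, table):
    """Return ('-', row) for rows only in A, then ('+', row) for rows only in B.

    Naive illustration: fetches every row from both tables and compares
    them as sets. `table` must be a trusted identifier, since it is
    interpolated directly into the SQL text.
    """
    query = f"SELECT * FROM {table}"
    rows_a = set(conn_a.execute(query).fetchall())
    rows_b = set(conn_b.execute(query).fetchall())
    removed = [("-", row) for row in sorted(rows_a - rows_b)]
    added = [("+", row) for row in sorted(rows_b - rows_a)]
    return removed + added


def make_db(rows):
    """Create an in-memory database with a small `users` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    return conn


source = make_db([(1, "ada"), (2, "bob"), (3, "cy")])
target = make_db([(1, "ada"), (2, "rob"), (4, "dee")])

for sign, row in diff_rows(source, target, "users"):
    print(sign, row)
```

In this sketch a changed row (id 2) shows up twice, once with `-` for the old version and once with `+` for the new one, while rows present on only one side appear once.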


Contributors

We thank everyone who contributed so far!


Analytics


License

This project is licensed under the terms of the MIT License.
