
Command-line tool and Python library to efficiently diff rows across two different databases.



data-diff

Develop dbt models faster by testing as you code.

See how every change to dbt code affects the data produced in the modified model and downstream.


What is data-diff?

data-diff is an open-source package that shows the impact of your dbt code changes on your dbt models as you code.



Getting Started

Install data-diff

pip install data-diff

Update a few lines in your dbt_project.yml

#dbt_project.yml
vars:
  data_diff:
    prod_database: my_database
    prod_schema: my_default_schema
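Before running a diff, it can help to confirm that the two variables are actually present. The following is a minimal sketch, not part of data-diff: it assumes the file has already been parsed into a dictionary (for example with a YAML parser) and that both keys shown above are needed. The function name `missing_data_diff_vars` is hypothetical.

```python
from typing import Any, List, Mapping

# Keys taken from the dbt_project.yml snippet above.
REQUIRED_KEYS = ("prod_database", "prod_schema")


def missing_data_diff_vars(project: Mapping[str, Any]) -> List[str]:
    """Return the data_diff keys that are absent or empty in a parsed dbt_project.yml.

    Hypothetical helper for illustration; data-diff does its own validation.
    """
    vars_section = project.get("vars") or {}
    data_diff = vars_section.get("data_diff") or {}
    return [key for key in REQUIRED_KEYS if not data_diff.get(key)]


# The same structure as the YAML above, already parsed into a dict.
project = {
    "vars": {
        "data_diff": {
            "prod_database": "my_database",
            "prod_schema": "my_default_schema",
        }
    }
}
print(missing_data_diff_vars(project))  # prints []
```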

Run your first data diff!

dbt run && data-diff --dbt
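If you would rather trigger the same two steps from a Python script (for instance in a CI job), the `&&` behaviour can be approximated as below. This is a sketch under assumptions: `run_chain` is a hypothetical helper, and it relies only on the standard library's `subprocess.run`.

```python
import subprocess
from typing import List, Sequence


def run_chain(commands: Sequence[List[str]]) -> int:
    """Run commands in order and stop at the first non-zero exit code, like `&&`.

    Returns 0 if every command succeeded, otherwise the failing exit code.
    """
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling `run_chain([["dbt", "run"], ["data-diff", "--dbt"]])` would mirror the one-liner above: the diff only runs if `dbt run` exits successfully. Note that `subprocess.run` raises `FileNotFoundError` if an executable is not on the `PATH`.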

We recommend getting started by walking through our setup instructions, which contain examples and details.

Please reach out on the dbt Slack in #tools-datafold if you have any trouble whatsoever getting started!



Diffing between databases

Check out our documentation if you're looking to compare data across databases (for example, between Postgres and Snowflake).
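To make "diffing rows" concrete, here is a small in-memory illustration of what a row-level diff reports. It is a conceptual sketch only: it does not use data-diff or connect to any database, and it is not how data-diff works internally. The function `diff_rows`, the sample data, and the `-`/`+` markers are assumptions chosen for this example.

```python
from typing import Dict, Iterator, Tuple


def diff_rows(
    source: Dict[int, Tuple], target: Dict[int, Tuple]
) -> Iterator[Tuple[str, int, Tuple]]:
    """Compare two tables keyed by primary key.

    Yields ("-", key, row) for a row that exists only in `source` or differs
    there, and ("+", key, row) for the corresponding row in `target`.
    """
    for key in sorted(source.keys() | target.keys()):
        left, right = source.get(key), target.get(key)
        if left == right:
            continue  # identical rows are not reported
        if left is not None:
            yield ("-", key, left)
        if right is not None:
            yield ("+", key, right)


# Hypothetical rows standing in for the same table in two databases.
postgres_rows = {1: ("alice", 10), 2: ("bob", 20), 3: ("carol", 30)}
snowflake_rows = {1: ("alice", 10), 2: ("bob", 25), 4: ("dave", 40)}

for sign, key, row in diff_rows(postgres_rows, snowflake_rows):
    print(sign, key, row)
```

Here row 1 matches and is skipped, row 2 is reported as changed (one `-` and one `+` entry), row 3 is missing from the second table, and row 4 is missing from the first.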


Contributors

We thank everyone who contributed so far!



License

This project is licensed under the terms of the MIT License.

Release details

Version: 0.7.5

Source distribution: data_diff-0.7.5.tar.gz (90.0 kB)

Built distribution: data_diff-0.7.5-py3-none-any.whl (121.9 kB, Python 3)
