
Lotad helps you identify schema changes, data differences, and structural modifications between database versions.


lotad

A Python library for tracking data drift between DuckDB databases. It identifies schema changes, data differences, and structural modifications between versions. It is built as an exploratory tool that needs minimal setup, and it is particularly useful for assessing the impact of changes on downstream pipelines.

Features

  • Compare schemas and data between DuckDB databases
  • Write changes to dedicated tables matching original schemas for easy visualization
  • No primary key requirement
  • Support for sorting string-encoded and URL-encoded JSON
  • Detect missing tables, missing columns, and type mismatches
  • Analyze row differences with consistent hashing
  • Generate detailed comparison reports
  • Configure excluded/included tables with regex support
  • Specify excluded columns for each table
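
The include/exclude filtering in the last two bullets can be pictured with a small sketch. This is not lotad's implementation or its config format: the function name `select_tables`, its parameters, the use of full-string matching, and the rule that exclusions win over inclusions are all assumptions made for illustration.

```python
import re


def select_tables(tables, include=None, exclude=None):
    """Return the table names that pass hypothetical include/exclude regex filters.

    Assumption: a table is kept when it fully matches at least one include
    pattern (or no include patterns are given) and matches no exclude pattern.
    """
    include = [re.compile(p) for p in (include or [])]
    exclude = [re.compile(p) for p in (exclude or [])]

    selected = []
    for name in tables:
        # Skip tables that match none of the include patterns.
        if include and not any(p.fullmatch(name) for p in include):
            continue
        # Skip tables that match any exclude pattern.
        if any(p.fullmatch(name) for p in exclude):
            continue
        selected.append(name)
    return selected


tables = ["orders", "orders_tmp", "customers", "audit_log"]
print(select_tables(tables, include=[r"orders.*", r"customers"], exclude=[r".*_tmp"]))
# ['orders', 'customers']
```

How lotad itself resolves a table that matches both an included and an excluded pattern is not stated here, so check the generated config before relying on either behaviour.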

Quick Start

Install

Requires Python 3.12+.

pip install lotad

How to use

# Create a config file to quickly re-run the same diff check on 2 databases
lotad setup

# Or pass in an existing config to change it via the wizard
# Alternatively, you can edit the config file directly
lotad setup --config lotad_config.yaml

# To perform the diff check
lotad run --config lotad_config.yaml

# Or you can pass in a subset of the config params directly to the run command.
lotad run --help

Checking results

Results are written to a DuckDB file at the path set in the config. If no path is set, the file defaults to drift_analysis.db in the current directory.

For each table with data drift, a table is created inside that DuckDB file. The generated table contains the combined schema of the two databases plus the following metadata columns generated by lotad:

  • observed_in: the database the row was observed in
  • hashed_row: a hash-based representation of the row, excluding ignored columns
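
To make the hashed_row idea concrete, here is a minimal sketch of hashing a row while skipping ignored columns and normalizing string-encoded JSON so that key order does not matter. It is an illustration only, not lotad's actual algorithm: the helper names `normalize` and `hash_row`, the choice of SHA-256, and the JSON-based canonical form are all assumptions.

```python
import hashlib
import json


def normalize(value):
    """Re-serialize string-encoded JSON objects/arrays with sorted keys."""
    if isinstance(value, str):
        try:
            parsed = json.loads(value)
        except ValueError:
            return value  # not JSON, keep the string unchanged
        if isinstance(parsed, (dict, list)):
            return json.dumps(parsed, sort_keys=True, separators=(",", ":"))
    return value


def hash_row(row, ignored_columns=()):
    """Hash a row (dict of column -> value), skipping ignored columns.

    SHA-256 over a sorted-key JSON dump is an assumed canonical form.
    """
    kept = {col: normalize(val) for col, val in row.items() if col not in ignored_columns}
    canonical = json.dumps(kept, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two rows that differ only in JSON key order and in an ignored column.
a = {"id": 1, "payload": '{"b": 2, "a": 1}', "loaded_at": "2024-01-01"}
b = {"id": 1, "payload": '{"a": 1, "b": 2}', "loaded_at": "2024-06-30"}

print(hash_row(a, {"loaded_at"}) == hash_row(b, {"loaded_at"}))  # True
print(hash_row(a) == hash_row(b))  # False
```

Because the hash depends only on the row's content, rows can be matched across the two databases without a primary key.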

The following tables are also created and contain summary-level information:

  • lotad_db_data_drift_summary
  • lotad_missing_table_drift
  • lotad_table_schema_drift
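
Once a run has finished, the output file can be inspected from Python. The sketch below counts drifted rows per source database using the observed_in column. The helper name `rows_per_source` is hypothetical, the `orders` table name in the usage comments is a placeholder, and the sketch assumes the `duckdb` package is installed when you run it against a real output file.

```python
def rows_per_source(conn, drift_table):
    """Count drifted rows per source database using the observed_in column.

    `conn` is any connection exposing execute(sql).fetchall(), such as one
    returned by duckdb.connect(...). The table name is interpolated into the
    SQL as a quoted identifier, so pass only trusted table names.
    """
    quoted = '"' + drift_table.replace('"', '""') + '"'
    sql = (
        f"SELECT observed_in, COUNT(*) FROM {quoted} "
        "GROUP BY observed_in ORDER BY observed_in"
    )
    return conn.execute(sql).fetchall()


# Usage against a real output file (requires `pip install duckdb`):
#
#   import duckdb
#   conn = duckdb.connect("drift_analysis.db", read_only=True)
#   print(rows_per_source(conn, "orders"))  # "orders" is a placeholder name
#
# The summary tables can be read the same way, for example:
#
#   conn.execute("SELECT * FROM lotad_db_data_drift_summary").fetchall()
```

The columns of the summary tables are not described here, so `SELECT *` is the safest starting point for exploring them.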

License

This project is licensed under the MIT License.
