Lotad helps you identify schema changes, data differences, and structural modifications between database versions.
lotad
A Python library for tracking data drift between DuckDB databases. Helps identify schema changes, differences in data, and structural modifications between versions. Built as an exploratory tool with minimal setup required. Particularly useful for assessing downstream pipeline impacts.
Features
- Compare schemas and data between DuckDB databases
- Write changes to dedicated tables matching original schemas for easy visualization
- No primary key requirement
- Support for string-encoded and url-encoded JSON sorting
- Detect missing tables, columns and type mismatches
- Analyze row differences with consistent hashing
- Generate detailed comparison reports
- Configure excluded/included tables with regex support
- Specify excluded columns for each table
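To make the hashing-related features concrete, here is a minimal sketch of how rows can be compared without a primary key. It is not lotad's actual implementation: the function names (`normalize`, `hash_row`, `diff_rows`), the choice of SHA-256, and the sample rows are all assumptions made for illustration. It only sorts JSON object keys, which is one plausible reading of "JSON sorting".

```python
import hashlib
import json
from urllib.parse import unquote


def normalize(value):
    """Canonicalize string-encoded or url-encoded JSON so key order does not matter."""
    if isinstance(value, str):
        # Try the raw string first, then its url-decoded form.
        for candidate in (value, unquote(value)):
            try:
                parsed = json.loads(candidate)
            except ValueError:
                continue
            if isinstance(parsed, (dict, list)):
                return json.dumps(parsed, sort_keys=True, separators=(",", ":"))
    return value


def hash_row(row: dict, ignored: frozenset[str] = frozenset()) -> str:
    """Hash a row deterministically, skipping ignored columns (hypothetical helper)."""
    kept = {col: normalize(val) for col, val in row.items() if col not in ignored}
    payload = json.dumps(kept, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_rows(rows_a, rows_b, ignored: frozenset[str] = frozenset()):
    """Return rows found only in A and only in B, matched by hash instead of a key.

    Exact duplicate rows collapse to one entry in this simplified version.
    """
    hashed_a = {hash_row(r, ignored): r for r in rows_a}
    hashed_b = {hash_row(r, ignored): r for r in rows_b}
    only_a = [r for h, r in hashed_a.items() if h not in hashed_b]
    only_b = [r for h, r in hashed_b.items() if h not in hashed_a]
    return only_a, only_b


# Illustrative data: row 1 differs only in JSON key order and an ignored column.
old = [
    {"id": 1, "payload": '{"b": 2, "a": 1}', "loaded_at": "2024-01-01"},
    {"id": 2, "payload": '{"x": 1}', "loaded_at": "2024-01-01"},
]
new = [
    {"id": 1, "payload": '{"a": 1, "b": 2}', "loaded_at": "2024-02-01"},
    {"id": 2, "payload": '{"x": 9}', "loaded_at": "2024-02-01"},
]
only_old, only_new = diff_rows(old, new, ignored=frozenset({"loaded_at"}))
print(only_old)  # the old version of row 2
print(only_new)  # the new version of row 2
```

Because rows are matched on the hash of their contents, a changed row shows up once on each side rather than as an "update", which is the trade-off of not requiring a primary key.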
Quick Start
Install
Requires Python 3.12 or later.
pip install lotad
How to use
# Create a config file to quickly re-run the same diff check on 2 databases
lotad setup --config lotad_config.yaml
# To perform the diff check
lotad run --config lotad_config.yaml
# Or you can pass in a subset of the config params directly to the run command.
lotad run --help
Checking results
Results are written to a DuckDB file at the path set in the config. If no path is set, the file defaults to drift_analysis.db in the current directory.
For each table with data drift, a corresponding table is created in the results file. The generated table contains the combined schema of the two databases plus the following metadata columns generated by lotad:
- observed_in: the db the row was in
- hashed_row: a hash-based representation of the row, excluding ignored columns
The following tables are also created and contain summary-level information:
- lotad_db_data_drift_summary
- lotad_missing_table_drift
- lotad_table_schema_drift
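Once a run has finished, the results file can be inspected with the `duckdb` Python package. The sketch below counts drifted rows per source database using the `observed_in` column described above. The helper names (`rows_per_source`, `open_results`) are hypothetical, and the name of each per-table drift table is whatever lotad created in your file, so check the file before relying on a particular name.

```python
def rows_per_source(con, drift_table: str) -> dict[str, int]:
    """Count rows in a drift table grouped by the observed_in metadata column.

    `con` is any connection exposing execute(sql).fetchall(), such as the
    object returned by duckdb.connect().
    """
    # Quote the identifier so unusual table names do not break the query.
    quoted = '"' + drift_table.replace('"', '""') + '"'
    query = (
        f"SELECT observed_in, COUNT(*) FROM {quoted} "
        "GROUP BY observed_in ORDER BY observed_in"
    )
    return {source: count for source, count in con.execute(query).fetchall()}


def open_results(path: str = "drift_analysis.db"):
    """Open the results file read-only (the default path comes from the text above)."""
    import duckdb  # third-party package; imported here so the helper above has no hard dependency

    return duckdb.connect(path, read_only=True)
```

For instance, `rows_per_source(open_results(), "orders")` would return a mapping from each `observed_in` value to its row count, where `orders` stands in for one of your own drifted tables. The summary tables listed above can be read the same way with a plain `SELECT *`, since their columns are not documented here.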
License
This project is licensed under the MIT License.