Project description

ddrift analyzes the differences between data sources, i.e. how much dataset A has drifted from dataset B. The framework is engine-agnostic: each engine only has to comply with a few simple abstract protocols, which is what enables the standard reporting across engines.
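To make the idea of an engine protocol concrete, here is a minimal sketch using `typing.Protocol`. Everything in it (`FrequencyEngine`, `InMemoryEngine`, `value_counts`, `report`) is a hypothetical name invented for illustration; it is not ddrift's actual interface, only one way such a contract could look.

```python
from collections import Counter
from typing import Protocol, runtime_checkable


@runtime_checkable
class FrequencyEngine(Protocol):
    """Hypothetical protocol (an assumption, not ddrift's real interface)."""

    def value_counts(self, source: str, columns: tuple[str, ...]) -> Counter:
        """Return how often each combination of `columns` occurs in `source`."""
        ...


class InMemoryEngine:
    """Toy engine backed by a dict mapping table name -> list of row dicts."""

    def __init__(self, tables: dict[str, list[dict[str, str]]]) -> None:
        self.tables = tables

    def value_counts(self, source: str, columns: tuple[str, ...]) -> Counter:
        return Counter(tuple(row[c] for c in columns) for row in self.tables[source])


def report(engine: FrequencyEngine, a: str, b: str, columns: tuple[str, ...]) -> dict:
    """Engine-agnostic report: maps each key to (count in a, count in b)."""
    counts_a = engine.value_counts(a, columns)
    counts_b = engine.value_counts(b, columns)
    # Counter returns 0 for missing keys, so keys present on one side only still work.
    keys = sorted(counts_a.keys() | counts_b.keys())
    return {key: (counts_a[key], counts_b[key]) for key in keys}
```

Because `report` depends only on the protocol, any engine that offers a compatible `value_counts` method could be plugged in without changing the reporting code.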

Engines Supported:

  • DuckDB

Engines to be Supported:

  • Postgres
  • Polars
  • Pandas

Install with pip install ddrift or (preferably) uv pip install ddrift.

Getting Started

Let's create two simple tables and compare them. The fundamental question we're asking is "How much has table2 drifted from table1?"

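import duckdb
# SQLComparator is provided by ddrift and must be imported as well; its import path is not shown here.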

with duckdb.connect() as con:
    con.execute("CREATE TABLE table1 (city VARCHAR, state VARCHAR)")
    con.execute(
        "INSERT INTO table1 VALUES ('New York', 'NY'), ('Los Angeles', 'CA'), ('Chicago', 'IL')"
    )

    con.execute("CREATE TABLE table2 (city VARCHAR, state VARCHAR)")
    con.execute(
        "INSERT INTO table2 VALUES ('New York', 'NY'), ('Phoenix', 'AZ'), ('Philadelphia', 'PA')"
    )

    sql = SQLComparator(df1="table1", df2="table2", con=con)
    sql.comp_freq(vars=("city", "state"))

    print(sql.results)  # prints a list of result objects, each holding an in-memory representation
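For intuition about what a frequency comparison over ("city", "state") could report, the sketch below counts each combination in both tables with plain SQL. It is an assumption about the kind of result involved, not ddrift's implementation, and it deliberately swaps DuckDB for the standard-library sqlite3 module so that it runs without extra dependencies. The column names n_table1 and n_table2 are illustrative.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Same data as the example above.
con.execute("CREATE TABLE table1 (city TEXT, state TEXT)")
con.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("New York", "NY"), ("Los Angeles", "CA"), ("Chicago", "IL")],
)
con.execute("CREATE TABLE table2 (city TEXT, state TEXT)")
con.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [("New York", "NY"), ("Phoenix", "AZ"), ("Philadelphia", "PA")],
)

# Tag each row with the table it came from, then sum the tags per (city, state).
QUERY = """
SELECT city, state, SUM(n1) AS n_table1, SUM(n2) AS n_table2
FROM (
    SELECT city, state, 1 AS n1, 0 AS n2 FROM table1
    UNION ALL
    SELECT city, state, 0 AS n1, 1 AS n2 FROM table2
) AS combined
GROUP BY city, state
ORDER BY city, state
"""

rows = con.execute(QUERY).fetchall()
# Combinations whose counts differ between the two tables have "drifted".
drifted = [row for row in rows if row[2] != row[3]]

for row in rows:
    print(row)

con.close()
```

Here only ('New York', 'NY') appears equally often in both tables; the other four combinations exist on one side only.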

Download files

Source Distribution

ddrift-0.1.0.tar.gz (24.6 kB)

Uploaded Source

Built Distribution

ddrift-0.1.0-py3-none-any.whl (3.8 kB)

Uploaded Python 3

File details

Details for the file ddrift-0.1.0.tar.gz.

File metadata

  • Download URL: ddrift-0.1.0.tar.gz
  • Upload date:
  • Size: 24.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.2

File hashes

Hashes for ddrift-0.1.0.tar.gz

Algorithm    Hash digest
SHA256       5363c7df2dd9a38fe1884d09971ee740477d3ec0f703e6d8dd63094e48ec0671
MD5          4154db6153b519057c7efa66e433a4e5
BLAKE2b-256  06e774e630c0cd0df2867d35782dda728439954f156cc93f37bb0ee01a63b309
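
A downloaded file can be checked against the published SHA256 digest with the standard library alone. The sketch below is one way to do it; the helper names `sha256_of` and `verify` are illustrative, and the local path in the usage comment assumes the archive was saved to the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for ddrift-0.1.0.tar.gz.
EXPECTED_SHA256 = "5363c7df2dd9a38fe1884d09971ee740477d3ec0f703e6d8dd63094e48ec0671"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest matches the expected hex string."""
    return sha256_of(path) == expected_hex.lower()


# Usage (assumes the archive is in the current directory):
# verify(Path("ddrift-0.1.0.tar.gz"), EXPECTED_SHA256)
```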

File details

Details for the file ddrift-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: ddrift-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 3.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.6.2

File hashes

Hashes for ddrift-0.1.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       02705b39f332cc8e0da4672c0c99dc461f295dcb8b4d0e4535250a0bff299566
MD5          f2bcab085b873604662475a258ea2960
BLAKE2b-256  7d3ffe93b656c197c0c4632114e732cc40b7686f870a685292da03432f5f683c
