
Finds and summarizes the differences between two datasets.


deltascan


deltascan is a Python package that finds and summarizes the differences between two datasets.

Installation

pip install deltascan

Main Features

The DeltaScan class compares any two supported data structures across one or more dimensions.

Data Structures:

  • DataFrame
  • Series
  • LazyFrame (Polars only)

Dimensions:

  • Rows → rows present in one dataset but missing in the other, aligned using join_on.
  • Columns → differences in column names and data types.
  • Values → mismatched values within matching rows and columns.
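To make the three dimensions concrete, the following is a minimal sketch in plain pandas of what each one measures. It is not deltascan's implementation and does not use its API; the toy frames and every variable name are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical toy inputs (not the example data used later in this README).
left = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"], "score": [1.0, 2.0, 3.0]})
right = pd.DataFrame({"id": [1, 3, 4], "name": ["a", "x", "d"], "extra": [0, 0, 0]})

# Columns: names that exist on one side only.
left_only_cols = sorted(set(left.columns) - set(right.columns))
right_only_cols = sorted(set(right.columns) - set(left.columns))

# Rows: outer-join the key column and read pandas' merge indicator.
keys = left[["id"]].merge(right[["id"]], on="id", how="outer", indicator=True)
left_only_ids = sorted(keys.loc[keys["_merge"] == "left_only", "id"].tolist())
right_only_ids = sorted(keys.loc[keys["_merge"] == "right_only", "id"].tolist())

# Values: on rows present in both frames, compare a shared column.
both = left.merge(right, on="id", how="inner", suffixes=("_left", "_right"))
name_mismatch_ids = sorted(
    both.loc[both["name_left"] != both["name_right"], "id"].tolist()
)

print(left_only_cols, right_only_cols)   # columns dimension
print(left_only_ids, right_only_ids)     # rows dimension
print(name_mismatch_ids)                 # values dimension
```

Here the key column id plays the role that join_on plays in DeltaScan: it decides which rows are considered the same record before any values are compared.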

Example Usage

Imports

Import the DeltaScan class.

from deltascan import DeltaScan

Create DataFrames

Create two sample DataFrame objects to compare.

import pandas as pd
import polars as pl
import datetime


# February Data
left_data = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'date': [pd.to_datetime('2026-02-28')] * 4,
    'first_name': ['Alice', 'Mike', 'John', 'Sarah'],
    'flag': [True, False, True, False],
    'amount': [10.0, 5.3, 33.7, 99.3],
    })

# January Data
right_data = pl.DataFrame({
    'id': [1, 3, 9],
    'date': [datetime.date(2026, 1, 31)] * 3,
    'first_name': ['Alice', 'Michael', 'Zachary'],
    'color': ['Pink', 'Blue', 'Red'],
    'last_name': ['Jones', 'Smith', 'Einck'],
    'flag': [False, True, False],
    'amount': [10, None, 14],
    })

Compare DataFrames

Create a DeltaScan instance to perform the comparison. See the in-code documentation for a complete list of available arguments.

ds = DeltaScan(
    left_data=left_data,
    right_data=right_data,
    join_on='id',
    left_alias='feb',
    right_alias='jan',
    left_context=['first_name'],
    right_context=None,
    verbose=True,
    )

Comparison Results

Access the comparison results using the summary and differences attributes.

print(ds.summary)
shape: (8, 6)
┌─────────────────────┬───────────┬─────────────┬─────────┬───────┬──────────────┐
│ Comparison          ┆ Dimension ┆ Differences ┆ Matches ┆ Total ┆ Match Rate % │
│ ---                 ┆ ---       ┆ ---         ┆ ---     ┆ ---   ┆ ---          │
│ str                 ┆ str       ┆ i64         ┆ i64     ┆ i64   ┆ f64          │
╞═════════════════════╪═══════════╪═════════════╪═════════╪═══════╪══════════════╡
│ jan cols not in feb ┆ columns   ┆ 2           ┆ 5       ┆ 7     ┆ 0.714286     │
│ data types          ┆ columns   ┆ 2           ┆ 3       ┆ 5     ┆ 0.6          │
│ feb rows not in jan ┆ rows      ┆ 2           ┆ 2       ┆ 4     ┆ 0.5          │
│ jan rows not in feb ┆ rows      ┆ 1           ┆ 2       ┆ 3     ┆ 0.666667     │
│ amount              ┆ values    ┆ 1           ┆ 1       ┆ 2     ┆ 0.5          │
│ date                ┆ values    ┆ 2           ┆ 0       ┆ 2     ┆ 0.0          │
│ first_name          ┆ values    ┆ 1           ┆ 1       ┆ 2     ┆ 0.5          │
│ flag                ┆ values    ┆ 1           ┆ 1       ┆ 2     ┆ 0.5          │
└─────────────────────┴───────────┴─────────────┴─────────┴───────┴──────────────┘
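
The Total and Match Rate % columns appear to be derived from the other two: in every row shown, Total equals Differences plus Matches, and the rate equals Matches divided by Total, printed as a fraction between 0 and 1 rather than a percentage. The check below reproduces the printed rates from counts copied out of the table; match_rate is a hypothetical helper, not a deltascan function.

```python
def match_rate(differences: int, matches: int) -> float:
    """Fraction of compared items that match, rounded to 6 decimals.

    Hypothetical helper for illustration; assumes at least one item was compared.
    """
    total = differences + matches
    return round(matches / total, 6)


# (differences, matches, printed rate) copied from the summary table above.
table_rows = {
    "jan cols not in feb": (2, 5, 0.714286),
    "data types": (2, 3, 0.6),
    "feb rows not in jan": (2, 2, 0.5),
    "jan rows not in feb": (1, 2, 0.666667),
    "date": (2, 0, 0.0),
}

# Compare the recomputed rate with the printed one for each row.
checks = {
    name: match_rate(diffs, matches) == printed
    for name, (diffs, matches, printed) in table_rows.items()
}
print(checks)
```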

Export to Excel

Export the results to an Excel file.

ds.to_excel()

