Finds and summarizes the differences between two datasets.

deltascan

deltascan is a Python package that finds and summarizes the differences between two datasets.

Installation

pip install deltascan

Main Features

The DeltaScan class compares any two supported data structures across one or more dimensions.

Data Structures:

  • DataFrame
  • Series
  • LazyFrame (Polars only)

Dimensions:

  • Rows → rows present in one dataset but missing in the other, aligned using join_on.
  • Columns → differences in column names and data types.
  • Values → mismatched values within matching rows and columns.
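
To make the three dimensions concrete, here is a minimal plain-Python sketch of the idea. It is a conceptual stand-in, not deltascan's implementation: the `left` and `right` dictionaries (keyed by an `id` that plays the role of join_on) and the `column_names` helper are hypothetical, and the column check covers names only, not data types.

```python
# Conceptual sketch only: plain dictionaries stand in for two datasets.
# The dictionary key plays the role of the join_on column.
# None of these names come from deltascan.

left = {
    1: {"first_name": "Alice", "amount": 10.0},
    2: {"first_name": "Mike", "amount": 5.3},
    3: {"first_name": "John", "amount": 33.7},
}
right = {
    1: {"first_name": "Alice", "amount": 10, "color": "Pink"},
    3: {"first_name": "Michael", "amount": None, "color": "Blue"},
    9: {"first_name": "Zachary", "amount": 14, "color": "Red"},
}


def column_names(records):
    """Collect every column name that appears in any row."""
    names = set()
    for row in records.values():
        names.update(row)
    return names


# Rows: keys present in one dataset but missing in the other.
left_only_rows = sorted(left.keys() - right.keys())
right_only_rows = sorted(right.keys() - left.keys())
shared_rows = sorted(left.keys() & right.keys())

# Columns: names present in one dataset but missing in the other.
left_cols, right_cols = column_names(left), column_names(right)
left_only_cols = sorted(left_cols - right_cols)
right_only_cols = sorted(right_cols - left_cols)
shared_cols = sorted(left_cols & right_cols)

# Values: mismatches within rows and columns that both datasets share.
value_diffs = [
    (key, col, left[key][col], right[key][col])
    for key in shared_rows
    for col in shared_cols
    if left[key][col] != right[key][col]
]

print(left_only_rows, right_only_rows)
print(left_only_cols, right_only_cols)
print(value_diffs)
```

Values are compared only where both the row key and the column exist on both sides, which is why rows 2 and 9 and the `color` column never appear in `value_diffs`.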

Example Usage

Imports

Import the DeltaScan class.

from deltascan import DeltaScan

Create DataFrames

Create two sample DataFrame objects to compare: one pandas DataFrame and one Polars DataFrame.

import pandas as pd
import polars as pl
import datetime


# February Data
left_data = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'date': [pd.to_datetime('2026-02-28')] * 4,
    'first_name': ['Alice', 'Mike', 'John', 'Sarah'],
    'flag': [True, False, True, False],
    'amount': [10.0, 5.3, 33.7, 99.3],
    })

# January Data
right_data = pl.DataFrame({
    'id': [1, 3, 9],
    'date': [datetime.date(2026, 1, 31)] * 3,
    'first_name': ['Alice', 'Michael', 'Zachary'],
    'color': ['Pink', 'Blue', 'Red'],
    'last_name': ['Jones', 'Smith', 'Einck'],
    'flag': [False, True, False],
    'amount': [10, None, 14],
    })

Compare DataFrames

Create a DeltaScan instance to perform the comparison. See the in-code documentation for a complete list of available arguments.

ds = DeltaScan(
    left_data=left_data,
    right_data=right_data,
    join_on='id',
    left_alias='feb',
    right_alias='jan',
    left_context=['first_name'],
    right_context=None,
    verbose=True,
    )

Comparison Results

Access the comparison results using the summary and differences attributes.

print(ds.summary)
shape: (8, 6)
┌─────────────────────┬───────────┬─────────────┬─────────┬───────┬──────────────┐
│ Comparison          ┆ Dimension ┆ Differences ┆ Matches ┆ Total ┆ Match Rate % │
│ ---                 ┆ ---       ┆ ---         ┆ ---     ┆ ---   ┆ ---          │
│ str                 ┆ str       ┆ i64         ┆ i64     ┆ i64   ┆ f64          │
╞═════════════════════╪═══════════╪═════════════╪═════════╪═══════╪══════════════╡
│ jan cols not in feb ┆ columns   ┆ 2           ┆ 5       ┆ 7     ┆ 0.714286     │
│ data types          ┆ columns   ┆ 2           ┆ 3       ┆ 5     ┆ 0.6          │
│ feb rows not in jan ┆ rows      ┆ 2           ┆ 2       ┆ 4     ┆ 0.5          │
│ jan rows not in feb ┆ rows      ┆ 1           ┆ 2       ┆ 3     ┆ 0.666667     │
│ amount              ┆ values    ┆ 1           ┆ 1       ┆ 2     ┆ 0.5          │
│ date                ┆ values    ┆ 2           ┆ 0       ┆ 2     ┆ 0.0          │
│ first_name          ┆ values    ┆ 1           ┆ 1       ┆ 2     ┆ 0.5          │
│ flag                ┆ values    ┆ 1           ┆ 1       ┆ 2     ┆ 0.5          │
└─────────────────────┴───────────┴─────────────┴─────────┴───────┴──────────────┘
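
The numbers in the summary are consistent with Differences + Matches = Total and with Match Rate % being Matches / Total expressed as a fraction rather than a percentage. This reading is inferred from the printed table, not from deltascan's source. The sketch below reproduces the last column from the Matches and Total columns; the `match_rate` helper and the `summary_rows` list are hypothetical names used for illustration.

```python
# Sketch: recompute the last column of the summary table from Matches and Total.
# The formula is inferred from the printed output and is an assumption.

def match_rate(matches, total):
    """Return matches / total as a fraction, or None when total is zero."""
    if total == 0:
        return None
    return matches / total


# (comparison, differences, matches, total), copied from the table above.
summary_rows = [
    ("jan cols not in feb", 2, 5, 7),
    ("data types", 2, 3, 5),
    ("feb rows not in jan", 2, 2, 4),
    ("jan rows not in feb", 1, 2, 3),
    ("amount", 1, 1, 2),
    ("date", 2, 0, 2),
    ("first_name", 1, 1, 2),
    ("flag", 1, 1, 2),
]

# Differences and matches should account for every item that was compared.
totals_consistent = all(d + m == t for _, d, m, t in summary_rows)

# Round to six decimals to match the precision shown in the table.
rates = {name: round(match_rate(m, t), 6) for name, _, m, t in summary_rows}

print(totals_consistent)
for name, rate in rates.items():
    print(f"{name:<20} {rate}")
```

Read this way, a value of 0.5 for `feb rows not in jan` means that half of the February rows have a matching `id` in January.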

Export to Excel

Export the results to an Excel file.

ds.to_excel()
