Utility package for comparing Polars DataFrames.

Project description


diffly: A utility package for comparing 🐻‍❄️ DataFrames

📖 Introduction

Diffly is a Python package for comparing Polars DataFrames. It reports how two datasets differ: schema differences, row-level mismatches, missing rows, and column value changes.

💿 Installation

You can install diffly using your favorite package manager, e.g., pixi or pip:

pixi add diffly
pip install diffly
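
After installing, one way to confirm that the package is visible to the current interpreter is to query its installed version. This sketch uses only the standard library; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("diffly")
if found is None:
    print("diffly is not installed in this environment")
else:
    print(f"diffly {found}")
```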

🎯 Usage

import polars as pl
from diffly import compare_frames

left = pl.DataFrame({
    "id": ["a", "b", "c"],
    "value": [1.0, 2.0, 3.0],
})

right = pl.DataFrame({
    "id": ["a", "b", "d"],
    "value": [1.0, 2.5, 4.0],
})

comparison = compare_frames(left, right, primary_key="id")

if not comparison.equal():
    summary = comparison.summary(
        top_k_column_changes=1,
        show_sample_primary_key_per_change=True
    )
    print(summary)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                                     Diffly Summary                                     ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
   Primary key: id

 Schemas
 ▔▔▔▔▔▔▔
   Schemas match exactly (column count: 2).

 Rows
 ▔▔▔▔
   Left count             Right count
       3      (no change)      3

   ┏━┯━┯━┯━┯━┓
   ┃-│-│-│-│-┃                1  left only   (33.33%)
   ┠─┼─┼─┼─┼─┨╌╌╌┏━┯━┯━┯━┯━┓╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╮
   ┃ │ │ │ │ ┃ = ┃ │ │ │ │ ┃  1  equal       (50.00%)  │
   ┠─┼─┼─┼─┼─┨╌╌╌┠─┼─┼─┼─┼─┨╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌├╴  2  joined
   ┃ │ │ │ │ ┃ ≠ ┃ │ │ │ │ ┃  1  unequal     (50.00%)  │
   ┗━┷━┷━┷━┷━┛╌╌╌┠─┼─┼─┼─┼─┨╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╯
                 ┃+│+│+│+│+┃  1  right only  (33.33%)
                 ┗━┷━┷━┷━┷━┛

 Columns
 ▔▔▔▔▔▔▔
   ┌───────┬────────┬───────────────────────────┐
   │ value │ 50.00% │ 2.0 -> 2.5 (1x, e.g. "b") │
   └───────┴────────┴───────────────────────────┘

See more examples in the documentation.
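
The row counts and percentages in the summary can be checked by hand. The plain-Python sketch below recomputes them from the example frames. It assumes that the left-only and right-only percentages are relative to each frame's row count, and that the equal and unequal percentages are relative to the joined rows. That reading matches the printed numbers but is not taken from diffly's source.

```python
# The example frames from the usage snippet, as id -> value mappings.
left = {"a": 1.0, "b": 2.0, "c": 3.0}
right = {"a": 1.0, "b": 2.5, "d": 4.0}

# Partition the primary keys into the groups shown in the "Rows" diagram.
left_only = sorted(left.keys() - right.keys())
right_only = sorted(right.keys() - left.keys())
joined = sorted(left.keys() & right.keys())
equal = [key for key in joined if left[key] == right[key]]
unequal = [key for key in joined if left[key] != right[key]]


def pct(part: int, whole: int) -> str:
    """Format a share as a percentage with two decimals."""
    return f"{100 * part / whole:.2f}%"


print(len(left_only), "left only ", pct(len(left_only), len(left)))
print(len(equal), "equal     ", pct(len(equal), len(joined)))
print(len(unequal), "unequal   ", pct(len(unequal), len(joined)))
print(len(right_only), "right only", pct(len(right_only), len(right)))
print(len(joined), "joined")
```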

Download files

Source Distribution

diffly-1.0.0.tar.gz (221.5 kB)

Uploaded Source

Built Distribution

diffly-1.0.0-py3-none-any.whl (25.7 kB)

Uploaded Python 3

File details

Details for the file diffly-1.0.0.tar.gz.

File metadata

  • Download URL: diffly-1.0.0.tar.gz
  • Upload date:
  • Size: 221.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for diffly-1.0.0.tar.gz
Algorithm    Hash digest
SHA256       eee2092c5e6aa402822312230018cf70e386ac8ec3c581cebf7bd58f7cbd15f7
MD5          900a414446d1b14e08d9852234d0a6cf
BLAKE2b-256  e00d5d9bca3977283653ef26e7ebbbc5b1e1f8467c2be57d6d3c875e1a7e49fc
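
A downloaded archive can be checked against the SHA256 digest listed above using the standard library. This is a minimal sketch: the helper name `sha256_of` is hypothetical, and the local path assumes the file was saved to the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed for diffly-1.0.0.tar.gz.
EXPECTED_SHA256 = "eee2092c5e6aa402822312230018cf70e386ac8ec3c581cebf7bd58f7cbd15f7"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


archive = Path("diffly-1.0.0.tar.gz")  # assumed download location
if archive.exists():
    print("match" if sha256_of(archive) == EXPECTED_SHA256 else "MISMATCH")
else:
    print(f"{archive} not found; download it first")
```

pip can perform the same check at install time when a requirements file pins hashes and `--require-hashes` is passed.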

Provenance

The following attestation bundles were made for diffly-1.0.0.tar.gz:

Publisher: build.yml on Quantco/diffly

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file diffly-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: diffly-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 25.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for diffly-1.0.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       13a5eb8e0ab9f46cf02859edbd22cd54cfd9b6d70851a2007c27c6e6865bcf30
MD5          cc78e07550e4fae3ef35ba8736193b3d
BLAKE2b-256  65fcf31285126a797a0447adb4403e41be380fd87c515825d4a4cdabf4e2f1d7

Provenance

The following attestation bundles were made for diffly-1.0.0-py3-none-any.whl:

Publisher: build.yml on Quantco/diffly

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
