Suite of pandas utilities including a DataFrame comparison report builder.

pandascompare

pandascompare is a Python package designed to compare DataFrame objects, enabling you to quickly identify the differences between two datasets.

Installation

pip install pandascompare

Main Features

The PandasCompare class compares any two DataFrame objects along the following dimensions:

  • Rows ➔ discrepancies with respect to the join key(s) specified via the join_on argument.
  • Columns ➔ name differences or missing columns.
  • Values ➔ data that differs in terms of value or type.
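To make the three dimensions concrete, here is a minimal sketch in plain pandas. It does not use PandasCompare and is not a description of how the package works internally; the frames and variable names are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical sample frames, used only for this illustration.
left = pd.DataFrame({'id': [1, 2, 3], 'amount': [10.5, 5.3, 33.77]})
right = pd.DataFrame({
    'id': [1, 2, 9],
    'amount': [11.1, 5.3, 14.0],
    'last_name': ['Jones', 'Smith', 'Einck'],
})

# Rows: an outer join on the key, with indicator=True, labels where each key was found.
merged = left.merge(
    right, on='id', how='outer', suffixes=('_left', '_right'), indicator=True
)
left_only_ids = sorted(merged.loc[merged['_merge'] == 'left_only', 'id'])
right_only_ids = sorted(merged.loc[merged['_merge'] == 'right_only', 'id'])

# Columns: set differences on the column names.
cols_only_left = sorted(set(left.columns) - set(right.columns))
cols_only_right = sorted(set(right.columns) - set(left.columns))

# Values: for keys present in both frames, compare the shared column.
both = merged[merged['_merge'] == 'both']
changed_ids = sorted(both.loc[both['amount_left'] != both['amount_right'], 'id'])

print(left_only_ids, right_only_ids, cols_only_left, cols_only_right, changed_ids)
```

The join key plays the same role here as the join_on argument: rows are matched on it first, and only matched rows can be compared value by value.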

Example Usage

Please refer to the documentation within the code for more information.

Imports

from pandascompare import PandasCompare

Create DataFrames

First, let's create two sample DataFrame objects to compare.

import pandas as pd
import numpy as np

# February Data
left_df = pd.DataFrame({
    'id': [1, 2, 3],
    'date': [pd.to_datetime('2024-02-29')] * 3,
    'first_name': ['Alice', 'Mike', 'John'],
    'amount': [10.5, 5.3, 33.77],
    })

# January Data
right_df = pd.DataFrame({
    'id': [1, 2, 9],
    'date': [pd.to_datetime('2024-01-31')] * 3,
    'first_name': ['Alice', 'Michael', 'Zachary'],
    'last_name': ['Jones', 'Smith', 'Einck'],
    'amount': [11.1, np.nan, 14],
    })

Compare DataFrames

Next, initialize a PandasCompare instance to perform the comparison. The in-code documentation lists all available arguments.

pc = PandasCompare(
    left=left_df,
    right=right_df,
    left_label='feb',
    right_label='jan',
    join_on='id',
    left_ref=['first_name'],
    include_delta=True,
    verbose=True,
    )
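The include_delta argument suggests that the report can include differences between matched values; how PandasCompare computes them is not documented here. As a rough sketch under that assumption, the following plain-pandas code computes one plausible delta (left minus right) for the amount column of the sample data. The column names amount_feb, amount_jan, amount_delta, and amount_differs are hypothetical.

```python
import numpy as np
import pandas as pd

# Reduced copies of the sample frames so the sketch runs on its own.
left_df = pd.DataFrame({'id': [1, 2, 3], 'amount': [10.5, 5.3, 33.77]})
right_df = pd.DataFrame({'id': [1, 2, 9], 'amount': [11.1, np.nan, 14]})

# Keep only ids present in both frames, labelling each side's columns.
both = left_df.merge(right_df, on='id', how='inner', suffixes=('_feb', '_jan'))

# Assumed delta definition: left value minus right value (NaN if either is missing).
both['amount_delta'] = both['amount_feb'] - both['amount_jan']

# Treat two missing values as equal; anything else unequal counts as a difference.
both_missing = both['amount_feb'].isna() & both['amount_jan'].isna()
both['amount_differs'] = ~both_missing & (both['amount_feb'] != both['amount_jan'])

print(both)
```

With this definition, a value that is missing on one side yields a NaN delta but is still flagged as a difference, which is why the sketch tracks the flag separately from the delta.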

Export to Excel

Finally, let's export the compare report to an Excel file to view the results.

pc.to_excel()
