Generates comparison reports for pandas DataFrames.

pandascompare

pandascompare is a Python package designed to compare DataFrame objects, enabling you to quickly identify the differences between two datasets.

Installation

pip install pandascompare

Main Features

The PandasCompare class compares any two DataFrame objects along the following dimensions:

  • Rows ➔ discrepancies with respect to the join key(s) specified via the join_on argument.
  • Columns ➔ name differences or missing columns.
  • Values ➔ data that differs in terms of value or type.
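To make the three dimensions concrete, here is a minimal pandas-only sketch of the same idea. It is not the package's implementation: the helper name summarize_differences and its return keys are hypothetical, and it assumes the join key values are unique on each side.

```python
import pandas as pd


def summarize_differences(left, right, join_on):
    """Hypothetical helper: summarize row, column, and value differences."""
    keys = [join_on] if isinstance(join_on, str) else list(join_on)

    # Columns: names present on only one side.
    left_only_cols = [c for c in left.columns if c not in right.columns]
    right_only_cols = [c for c in right.columns if c not in left.columns]

    # Rows: an outer merge on the key(s) with indicator=True labels each key
    # as 'left_only', 'right_only', or 'both'.
    merged = left[keys].merge(right[keys], on=keys, how="outer", indicator=True)
    left_only_rows = merged.loc[merged["_merge"] == "left_only", keys]
    right_only_rows = merged.loc[merged["_merge"] == "right_only", keys]

    # Values: for keys found on both sides, compare each shared non-key column.
    shared = [c for c in left.columns if c in right.columns and c not in keys]
    both = left.merge(right, on=keys, how="inner", suffixes=("_left", "_right"))
    value_diffs = {}
    for col in shared:
        lhs, rhs = both[f"{col}_left"], both[f"{col}_right"]
        # Two missing values count as equal; everything else must match exactly.
        differs = ~((lhs == rhs) | (lhs.isna() & rhs.isna()))
        value_diffs[col] = both.loc[differs, keys + [f"{col}_left", f"{col}_right"]]

    return {
        "left_only_cols": left_only_cols,
        "right_only_cols": right_only_cols,
        "left_only_rows": left_only_rows,
        "right_only_rows": right_only_rows,
        "value_diffs": value_diffs,
    }


# Small illustrative frames (not taken from the package).
a = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
b = pd.DataFrame({"id": [1, 2, 9], "name": ["a", "x", "z"], "extra": [0, 0, 0]})
result = summarize_differences(a, b, join_on="id")
```

The outer merge with indicator=True handles the row dimension, a plain comparison of column names handles the column dimension, and the inner merge with suffixes lines up matching rows for the value dimension.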

Example Usage

Please refer to the documentation within the code for more information.

Imports

from pandascompare import PandasCompare

Create DataFrames

First, let's create two sample DataFrame objects to compare.

import pandas as pd
import numpy as np

# February Data
left_df = pd.DataFrame({
    'id': [1, 2, 3],
    'date': [pd.to_datetime('2024-02-29')] * 3,
    'first_name': ['Alice', 'Mike', 'John'],
    'amount': [10.5, 5.3, 33.77],
    })

# January Data
right_df = pd.DataFrame({
    'id': [1, 2, 9],
    'date': [pd.to_datetime('2024-01-31')] * 3,
    'first_name': ['Alice', 'Michael', 'Zachary'],
    'last_name': ['Jones', 'Smith', 'Einck'],
    'amount': [11.1, np.nan, 14],
    })

Compare DataFrames

Next, we will initialize a PandasCompare instance to perform the comparison. Please consult the in-code documentation for a comprehensive list of available arguments.

pc = PandasCompare(
    left=left_df,
    right=right_df,
    left_label='feb',
    right_label='jan',
    join_on='id',
    left_ref=['first_name'],
    include_delta=True,
    verbose=True,
    )
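The behavior of include_delta is documented in the package source rather than here. As a rough illustration only, assuming it refers to a numeric difference between matching left and right values, the calculation can be sketched in plain pandas. The direction (left minus right) and the amount_delta column name are assumptions, not the package's confirmed output.

```python
import numpy as np
import pandas as pd

# Reduced copies of the sample data: only the join key and the numeric column.
feb = pd.DataFrame({"id": [1, 2, 3], "amount": [10.5, 5.3, 33.77]})
jan = pd.DataFrame({"id": [1, 2, 9], "amount": [11.1, np.nan, 14]})

# Keep only ids present in both months; suffixes mirror the labels used above.
matched = feb.merge(jan, on="id", how="inner", suffixes=("_feb", "_jan"))

# Assumed direction: left (feb) minus right (jan). A missing value on either
# side yields NaN for that row.
matched["amount_delta"] = matched["amount_feb"] - matched["amount_jan"]
print(matched)
```

Only ids 1 and 2 appear on both sides, so ids 3 and 9 would surface as row discrepancies rather than as deltas.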

Export to Excel

Finally, let's export the comparison report to an Excel file to view the results.

pc.to_excel()
