Generates comparison reports for pandas DataFrames.

pandascompare

pandascompare is a Python package designed to compare DataFrame objects, enabling you to quickly identify the differences between two datasets.

Installation

pip install pandascompare

Main Features

The PandasCompare class compares any two DataFrame objects along the following dimensions:

  • Rows ➔ missing rows based on the join key(s) specified via the join_on argument.
  • Columns ➔ name differences or missing columns.
  • Values ➔ value mismatches in content or type.
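
To make the three dimensions concrete, here is a minimal sketch in plain pandas. It is not the package's implementation: `summarize_differences` is a hypothetical helper written for illustration, it assumes a single join key with unique values, and it checks value content only (not type).

```python
import pandas as pd


def summarize_differences(left, right, join_on):
    """Hypothetical helper: list row, column, and value differences."""
    # Rows: an outer merge with indicator=True marks keys found on one side only.
    keys = left[[join_on]].merge(
        right[[join_on]], on=join_on, how="outer", indicator=True
    )
    rows_left_only = sorted(keys.loc[keys["_merge"] == "left_only", join_on].tolist())
    rows_right_only = sorted(keys.loc[keys["_merge"] == "right_only", join_on].tolist())

    # Columns: labels present in one frame but not the other.
    cols_left_only = [c for c in left.columns if c not in right.columns]
    cols_right_only = [c for c in right.columns if c not in left.columns]

    # Values: line up shared keys and shared columns, then compare cell by cell.
    shared = [c for c in left.columns if c in right.columns and c != join_on]
    both = left.merge(right, on=join_on, how="inner", suffixes=("_left", "_right"))
    value_mismatches = []
    for col in shared:
        lhs, rhs = both[f"{col}_left"], both[f"{col}_right"]
        # Two missing values are treated as equal rather than as a mismatch.
        differs = ~((lhs == rhs) | (lhs.isna() & rhs.isna()))
        value_mismatches += [(key, col) for key in both.loc[differs, join_on].tolist()]

    return {
        "rows_left_only": rows_left_only,
        "rows_right_only": rows_right_only,
        "cols_left_only": cols_left_only,
        "cols_right_only": cols_right_only,
        "value_mismatches": value_mismatches,
    }


left = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"id": [1, 2, 9], "name": ["a", "B", "z"], "extra": [0, 0, 0]})
report = summarize_differences(left, right, "id")
```

Here `report` lists key 3 as left-only, key 9 as right-only, `extra` as a right-only column, and the `name` value for key 2 as a mismatch.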

Example Usage

Please refer to the documentation within the code for more information.

Imports

from pandascompare import PandasCompare

Create DataFrames

First, let's create two sample DataFrame objects to compare.

import pandas as pd
import numpy as np

# February Data
left_df = pd.DataFrame({
    'id': [1, 2, 3],
    'date': [pd.to_datetime('2024-02-29')] * 3,
    'first_name': ['Alice', 'Mike', 'John'],
    'amount': [10.5, 5.3, 33.77],
    })

# January Data
right_df = pd.DataFrame({
    'id': [1, 2, 9],
    'date': [pd.to_datetime('2024-01-31')] * 3,
    'first_name': ['Alice', 'Michael', 'Zachary'],
    'last_name': ['Jones', 'Smith', 'Einck'],
    'amount': [11.1, np.nan, 14],
    })

Compare DataFrames

Next, we will initialize a PandasCompare instance to perform the comparison. Please consult the in-code documentation for a comprehensive list of available arguments.

pc = PandasCompare(
    left_data=left_df,
    right_data=right_df,
    left_label='feb',
    right_label='jan',
    join_on='id',
    left_ref=['first_name'],
    include_delta=True,
    verbose=True,
    )
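
As a rough illustration of what a numeric delta between matched rows could look like, the sketch below uses plain pandas on the `id` and `amount` columns of the sample data. It assumes the delta is the left value minus the right value. That definition is an assumption made for this example, not a statement of how `include_delta` is computed inside the package.

```python
import numpy as np
import pandas as pd

feb = pd.DataFrame({"id": [1, 2, 3], "amount": [10.5, 5.3, 33.77]})
jan = pd.DataFrame({"id": [1, 2, 9], "amount": [11.1, np.nan, 14]})

# Keep only ids present on both sides and place the two amounts side by side.
merged = feb.merge(jan, on="id", how="inner", suffixes=("_feb", "_jan"))

# Assumed delta definition: left minus right. A missing value on either side
# propagates as NaN instead of raising an error.
merged["amount_delta"] = merged["amount_feb"] - merged["amount_jan"]

print(merged)
```

Only ids 1 and 2 survive the inner join. The delta for id 2 is NaN because the January amount is missing.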

Export to Excel

Finally, let's export the comparison report to an Excel file to view the results.

pc.to_excel()

Download files

Source Distribution

pandascompare-0.6.3.tar.gz (9.2 kB)

Built Distribution

pandascompare-0.6.3-py3-none-any.whl (10.5 kB)
