
Suite of pandas utilities including a DataFrame comparison report builder.


pandascompare


pandascompare is a Python package designed to compare DataFrame objects, enabling you to quickly identify the differences between two datasets.

Installation

pip install pandascompare

Main Features

The PandasCompare class compares any two DataFrame objects along the following dimensions:

  • Rows ➔ rows whose join key(s), specified via the join_on argument, do not match between the two DataFrames.
  • Columns ➔ name differences or missing columns.
  • Values ➔ data that differs in terms of value or type.
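
To make the three dimensions concrete, here is a minimal sketch of how each kind of difference can be found with plain pandas. It does not use PandasCompare and is not the package's implementation; the frames and variable names (left, right, keys, shared) are illustrative assumptions.

```python
import pandas as pd

# Illustrative frames (not part of pandascompare)
left = pd.DataFrame({'id': [1, 2, 3], 'amount': [10.5, 5.3, 33.77]})
right = pd.DataFrame({
    'id': [1, 2, 9],
    'amount': [11.1, 5.3, 14.0],
    'last_name': ['Jones', 'Smith', 'Einck'],
})

# Rows: an outer merge on the key with indicator=True labels each key
# as 'left_only', 'right_only', or 'both'.
keys = left[['id']].merge(right[['id']], on='id', how='outer', indicator=True)
left_only_ids = keys.loc[keys['_merge'] == 'left_only', 'id'].tolist()
right_only_ids = keys.loc[keys['_merge'] == 'right_only', 'id'].tolist()

# Columns: set differences on the column names.
left_only_cols = sorted(set(left.columns) - set(right.columns))
right_only_cols = sorted(set(right.columns) - set(left.columns))

# Values: align rows on the shared keys, then compare a shared column.
shared = left.merge(right, on='id', suffixes=('_left', '_right'))
mismatched_ids = shared.loc[
    shared['amount_left'] != shared['amount_right'], 'id'
].tolist()

print(left_only_ids, right_only_ids)    # ids found on one side only
print(left_only_cols, right_only_cols)  # columns found on one side only
print(mismatched_ids)                   # shared ids whose amounts differ
```

Note that a plain != comparison treats two missing values as different, so real data usually needs explicit NaN handling.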

Example Usage

The package is documented through docstrings in its source code; refer to them for details beyond this example.

Imports

from pandascompare import PandasCompare

Create DataFrames

First, let's create two sample DataFrame objects to compare.

import pandas as pd
import numpy as np

# February Data
left_df = pd.DataFrame({
    'id': [1, 2, 3],
    'date': [pd.to_datetime('2024-02-29')] * 3,
    'first_name': ['Alice', 'Mike', 'John'],
    'amount': [10.5, 5.3, 33.77],
    })

# January Data
right_df = pd.DataFrame({
    'id': [1, 2, 9],
    'date': [pd.to_datetime('2024-01-31')] * 3,
    'first_name': ['Alice', 'Michael', 'Zachary'],
    'last_name': ['Jones', 'Smith', 'Einck'],
    'amount': [11.1, np.nan, 14],
    })

Compare DataFrames

Next, we will initialize a PandasCompare instance to perform the comparison. Please consult the in-code documentation for a comprehensive list of available arguments.

pc = PandasCompare(
    left=left_df,
    right=right_df,
    left_label='feb',
    right_label='jan',
    join_on='id',
    left_ref=['first_name'],
    include_delta=True,
    verbose=True,
    )
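
The README does not spell out what include_delta=True computes. As a rough illustration only, the sketch below joins the two sample months on id with plain pandas and computes a per-row difference, assuming a delta means "left minus right". That definition, and the column names amount_delta and amount_differs, are assumptions made for this example; the package's docstrings are the authority on its actual output.

```python
import numpy as np
import pandas as pd

# Trimmed copies of the sample data above
feb = pd.DataFrame({'id': [1, 2, 3], 'amount': [10.5, 5.3, 33.77]})
jan = pd.DataFrame({'id': [1, 2, 9], 'amount': [11.1, np.nan, 14.0]})

# An inner join keeps only the ids present in both months.
both = feb.merge(jan, on='id', how='inner', suffixes=('_feb', '_jan'))

# Assumed delta definition: left (feb) minus right (jan).
both['amount_delta'] = both['amount_feb'] - both['amount_jan']

# Treat NaN vs. NaN as equal, but NaN vs. a number as a difference.
both_missing = both['amount_feb'].isna() & both['amount_jan'].isna()
both['amount_differs'] = (
    (both['amount_feb'] != both['amount_jan']) & ~both_missing
)

print(both)
```

In this sketch, id 2 is flagged as different because January's amount is missing, and its delta is NaN rather than a number.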

Export to Excel

Finally, let's export the comparison report to an Excel file to view the results.

pc.export_to_excel()
