pdcompare

Compares two pandas DataFrame objects and reports what changed between them.

pip install pdcompare

Requirements

Both DataFrames must share the same index for the comparison to work correctly. An error is raised if the index data types do not match, and a warning is issued if the index names differ.
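
A pre-check along these lines can catch index mismatches before the comparison is run. This is only a sketch in plain pandas that mirrors the requirements above; the helper name `check_index_compatibility` is hypothetical and is not part of pdcompare.

```python
import warnings

import pandas as pd


def check_index_compatibility(df1: pd.DataFrame, df2: pd.DataFrame) -> None:
    """Hypothetical helper: mirror the documented requirements with plain pandas."""
    # Different index dtypes make label matching unreliable, so fail early.
    if df1.index.dtype != df2.index.dtype:
        raise TypeError(
            f"Index dtypes differ: {df1.index.dtype} vs {df2.index.dtype}"
        )
    # Different index names are suspicious but not fatal, so only warn.
    if df1.index.name != df2.index.name:
        warnings.warn(
            f"Index names differ: {df1.index.name!r} vs {df2.index.name!r}"
        )


old = pd.DataFrame({"price": [10, 20]}, index=pd.Index(["a", "b"], name="sku"))
new = pd.DataFrame({"price": [10, 25]}, index=pd.Index(["a", "b"], name="sku"))
check_index_compatibility(old, new)  # same dtype and name, so nothing is raised
```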

STEPS

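Initialize a Compare object with the two DataFrames, then call its compare() method:

from pdcompare import Compare

# df1 is the first (old) DataFrame, df2 is the second (new) one
compare_object = Compare(df1, df2)
compare_object.compare()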

To get a dictionary of the resulting comparison data, call the output() method:

compare_object.output()
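
To try the steps above on concrete data, something like the following could work. The two small DataFrames and the wrapper name `run_comparison` are made up for illustration; only `Compare`, `compare()` and `output()` come from the steps above.

```python
import pandas as pd

# Small illustrative frames that share the same index name and dtype.
df1 = pd.DataFrame(
    {"price": [10, 20, 30]},
    index=pd.Index(["a", "b", "c"], name="sku"),
)
df2 = pd.DataFrame(
    {"price": [10, 25, 40], "stock": [5, 0, 7]},
    index=pd.Index(["a", "b", "d"], name="sku"),
)


def run_comparison(old: pd.DataFrame, new: pd.DataFrame) -> dict:
    """Hypothetical wrapper around the documented steps (needs pdcompare installed)."""
    # Imported inside the function so the sample frames load without the package.
    from pdcompare import Compare

    compare_object = Compare(old, new)
    compare_object.compare()
    return compare_object.output()
```

Calling `run_comparison(df1, df2)` should then return the dictionary described in the next section.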

Output Details

The .output() method returns a dictionary with the following keys and associated values:

| Key | Value | Value data type |
| --- | --- | --- |
| SUMMARY | High-level overview of differences | pd.DataFrame |
| ADDED | List of all index values that were added | pd.Series |
| ADDED_cols | List of all columns that were added | pd.Series |
| REMOVED | List of all index values that were removed | pd.Series |
| REMOVED_cols | List of all columns that were removed | pd.Series |
| CHANGED | See below for details | pd.DataFrame |
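
To make the ADDED and REMOVED entries concrete, here is a sketch of how the same information can be derived with plain pandas set operations. It is not pdcompare's implementation, and the sample frames are invented for illustration.

```python
import pandas as pd

old = pd.DataFrame(
    {"price": [10, 20, 30], "region": ["N", "S", "E"]},
    index=pd.Index(["a", "b", "c"], name="sku"),
)
new = pd.DataFrame(
    {"price": [10, 25, 40], "stock": [5, 0, 7]},
    index=pd.Index(["a", "b", "d"], name="sku"),
)

# Index labels found only in the new frame were added; only in the old, removed.
added = pd.Series(new.index.difference(old.index), name="ADDED")
removed = pd.Series(old.index.difference(new.index), name="REMOVED")

# The same set logic applies to column labels.
added_cols = pd.Series(new.columns.difference(old.columns), name="ADDED_cols")
removed_cols = pd.Series(old.columns.difference(new.columns), name="REMOVED_cols")

print(added.tolist(), removed.tolist())
print(added_cols.tolist(), removed_cols.tolist())
```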

CHANGED output data

This DataFrame has the following columns:

| Column header | Data |
| --- | --- |
| ID | Index value used to track the change |
| COLUMN | Column in which the value changed for that index |
| from | Value at that column and index in the first table (old) |
| to | Value at that column and index in the second table (new) |
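
The shape of the CHANGED data can be illustrated with a small plain-pandas sketch that builds a table with the same four columns. This is an approximation written for this README, not the library's own code, and the sample values are invented.

```python
import pandas as pd

old = pd.DataFrame(
    {"price": [10, 20, 30], "stock": [5, 1, 7]},
    index=pd.Index(["a", "b", "c"], name="sku"),
)
new = pd.DataFrame(
    {"price": [10, 25, 30], "stock": [5, 0, 9]},
    index=pd.Index(["a", "b", "c"], name="sku"),
)

# Only rows and columns present in both frames can have changed values.
rows = old.index.intersection(new.index)
cols = old.columns.intersection(new.columns)

records = []
for col in cols:
    before, after = old.loc[rows, col], new.loc[rows, col]
    # A cell counts as changed if the values differ and are not both missing.
    differs = (before != after) & ~(before.isna() & after.isna())
    for idx in rows[differs.to_numpy()]:
        records.append(
            {"ID": idx, "COLUMN": col, "from": before[idx], "to": after[idx]}
        )

changed = pd.DataFrame(records, columns=["ID", "COLUMN", "from", "to"])
print(changed)
```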

Examples

ScreamingFrog Crawl Comparison (SEO)

This is a great tool to compare crawls from different dates. Simply export the CSV files from ScreamingFrog. Then run this Google Colab notebook to create a Report in Google Sheets.

ScreamingFrog Crawl Compare in Colab

By default, the code that connects to Google Sheets and handles the formatting is hidden, but feel free to peek behind the curtain to see how it was done. You can display the first block of code by clicking the drop-down triangle on the far left side of the block.
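
Outside the notebook, preparing two crawl exports for comparison might look roughly like this. The inline CSV text stands in for real export files, and the column names (`Address`, `Status Code`, `Title 1`) are assumptions about the export layout; adjust them to match your files.

```python
import io

import pandas as pd

# Inline stand-ins for two exported crawl files; real exports have many more columns.
crawl_old_csv = io.StringIO(
    "Address,Status Code,Title 1\n"
    "https://example.com/,200,Home\n"
    "https://example.com/about,200,About\n"
)
crawl_new_csv = io.StringIO(
    "Address,Status Code,Title 1\n"
    "https://example.com/,200,Homepage\n"
    "https://example.com/contact,200,Contact\n"
)

# Index both frames by URL so rows are matched on the same key.
crawl_old = pd.read_csv(crawl_old_csv, index_col="Address")
crawl_new = pd.read_csv(crawl_new_csv, index_col="Address")

# Both indexes should share a name and dtype before the frames are compared.
print(crawl_old.index.name, crawl_new.index.name)
print(crawl_old.index.dtype == crawl_new.index.dtype)
```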

Weed Price Comparison

For a simple example to get acquainted quickly, use this one. Thanks to Vicki for pointing me in the direction of these small datasets, and thanks to Frank BI for supplying the free datasets. I used Frank's weed price data from 2004 and compared it to the 2005 data across the 50 states. The example can be found in this repo's example folder.

Thanks for using my code

If you found this library useful, I'd appreciate a coffee.

Buy Me A Coffee
