
Used to compare two pandas DataFrames


pdcompare


Used to compare two pandas DataFrame objects to see how they changed.

pip install pdcompare

Requirements

  • Both DataFrames must share the same index for the comparison to be correct
  • A warning is raised if the data types are non-numeric and you use .set_change_comparison(True)
  • A warning is raised if the index names of the two DataFrames differ
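The requirements above can be checked by hand before running a comparison. The following is a minimal sketch using only pandas. `check_comparable` is a hypothetical helper, not part of pdcompare, and it only approximates the conditions under which the library warns.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def check_comparable(df1: pd.DataFrame, df2: pd.DataFrame) -> list:
    """Return a list of readable problems; an empty list means none were found.

    Hypothetical pre-flight check, not pdcompare's own logic.
    """
    problems = []

    # Both frames should be indexed by the same key, e.g. both by "state".
    names1, names2 = list(df1.index.names), list(df2.index.names)
    if names1 != names2:
        problems.append(f"index names differ: {names1} vs {names2}")

    # Numeric change comparison only makes sense for numeric columns.
    shared = df1.columns.intersection(df2.columns)
    non_numeric = [
        col
        for col in shared
        if not (is_numeric_dtype(df1[col]) and is_numeric_dtype(df2[col]))
    ]
    if non_numeric:
        problems.append(f"non-numeric columns: {non_numeric}")

    return problems
```

An empty result suggests the two frames are reasonable inputs; anything else is worth fixing first, for example by calling `set_index` with the same column on both frames.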

STEPS

Initialize and call the compare() method:

from pdcompare import Compare

compare_object = Compare(df1,df2)
compare_object.compare()

To get a dictionary of the comparison results, call:

compare_object.output()
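As a fuller illustration of the steps above, here is a sketch with two tiny DataFrames. The sample data, the `sku` index name, and the `compare_frames` wrapper are made up for this example; only `Compare`, `.compare()`, and `.output()` come from the steps above.

```python
import pandas as pd

# Two small snapshots of the same table, keyed by the same index ("sku").
df_old = pd.DataFrame(
    {"price": [10.0, 20.0, 30.0]},
    index=pd.Index(["A1", "B2", "C3"], name="sku"),
)
df_new = pd.DataFrame(
    {"price": [10.0, 25.0, 40.0]},
    index=pd.Index(["A1", "B2", "D4"], name="sku"),
)


def compare_frames(old: pd.DataFrame, new: pd.DataFrame) -> dict:
    """Hypothetical wrapper: run pdcompare and return its output dictionary."""
    # Imported inside the function so the sample data loads without pdcompare.
    from pdcompare import Compare

    compare_object = Compare(old, new)
    compare_object.compare()
    return compare_object.output()
```

With pdcompare installed, `compare_frames(df_old, df_new)` should return the dictionary described under Output Details.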

Output Details

Once you call the .output() method, you will receive a dictionary object in return. This dictionary has the following keys and associated values:

| KEY | VALUE | VALUE Data Type |
|---|---|---|
| SUMMARY | high-level overview of differences | pd.DataFrame |
| ADDED | list of all index values that were added | pd.Series |
| ADDED_cols | list of all columns that were added | pd.Series |
| REMOVED | list of all index values that were removed | pd.Series |
| REMOVED_cols | list of all columns that were removed | pd.Series |
| CHANGED | (see below for details) | pd.DataFrame |
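To make the meaning of the ADDED and REMOVED keys concrete, the sketch below derives equivalent values with plain pandas set operations. It illustrates the semantics only; `index_and_column_diff` is a hypothetical helper and is not claimed to be how pdcompare computes its output.

```python
import pandas as pd


def index_and_column_diff(old: pd.DataFrame, new: pd.DataFrame) -> dict:
    """Describe which index values and columns appear in only one frame."""
    return {
        # In the new frame but not in the old one.
        "ADDED": pd.Series(new.index.difference(old.index)),
        "ADDED_cols": pd.Series(new.columns.difference(old.columns)),
        # In the old frame but not in the new one.
        "REMOVED": pd.Series(old.index.difference(new.index)),
        "REMOVED_cols": pd.Series(old.columns.difference(new.columns)),
    }
```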

CHANGED output data

This DataFrame has the following columns:

| Column Header | Data |
|---|---|
| ID | Index value of the row that changed |
| COLUMN | Column in which the value changed for that index |
| from | Value of the specified column & index in the first table (old) |
| to | Value of the specified column & index in the second table (new) |
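The layout of the CHANGED data can be pictured with a short pandas sketch that builds a table of the same shape. `changed_cells` is a hypothetical helper written for illustration and assumes unique index values; pdcompare's own implementation may differ.

```python
import pandas as pd


def changed_cells(old: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    """List every cell that differs among rows and columns present in both frames."""
    common_idx = old.index.intersection(new.index)
    common_cols = old.columns.intersection(new.columns)
    a = old.loc[common_idx, common_cols]
    b = new.loc[common_idx, common_cols]

    # A cell counts as changed when the values differ and are not both missing.
    mask = a.ne(b) & ~(a.isna() & b.isna())

    rows = [
        {"ID": idx, "COLUMN": col, "from": a.at[idx, col], "to": b.at[idx, col]}
        for idx in common_idx
        for col in common_cols
        if mask.at[idx, col]
    ]
    return pd.DataFrame(rows, columns=["ID", "COLUMN", "from", "to"])
```

Each output row describes one changed cell, so a single index value can appear several times if more than one of its columns changed.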

Examples

ScreamingFrog Crawl Comparison (SEO)

This is a great tool to compare crawls from different dates. Simply export the CSV files from ScreamingFrog. Then run this Google Colab notebook to create a Report in Google Sheets.

ScreamingFrog Crawl Compare in Colab

By default, the code that connects to Google Sheets and does all the formatting is hidden, but feel free to peek behind the curtain to see how it was done. You can display the first block of code by clicking the drop-down triangle on the far left side of the block.
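Outside the notebook, two crawl exports can be prepared for comparison with a few lines of pandas. This is a rough sketch, not the notebook's code: `load_crawl` is a hypothetical helper, the `Address` column is assumed to hold the crawled URL in the export, and the in-memory CSV text stands in for real export files.

```python
import io

import pandas as pd


def load_crawl(source, key: str = "Address") -> pd.DataFrame:
    """Read a crawl export and index it by URL so two crawls line up row by row."""
    df = pd.read_csv(source)
    # Keep one row per URL so each index value identifies exactly one page.
    df = df.drop_duplicates(subset=key)
    return df.set_index(key)


# Stand-ins for two exported CSV files from different dates.
old_csv = io.StringIO(
    "Address,Status Code\nhttps://example.com/,200\nhttps://example.com/a,200\n"
)
new_csv = io.StringIO(
    "Address,Status Code\nhttps://example.com/,200\nhttps://example.com/a,404\n"
)

crawl_old = load_crawl(old_csv)
crawl_new = load_crawl(new_csv)
```

Both frames now share the same index name, so they can be passed to `Compare` as the first (old) and second (new) tables.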

Weed Price Comparison

For a simple example to get acquainted quickly, use this one. Thanks to Vicki for pointing me in the direction of these small datasets, and thanks to Frank BI for supplying the free datasets. I took Frank's weed price data from 2004 and compared it with the 2005 data across the 50 states. The example can be found in this repo's example folder.

Thanks for using my code

If you found this library useful, I'd appreciate a coffee.

Buy Me A Coffee
