Suite of pandas utilities including a DataFrame comparison report builder.
pandascompare
pandascompare is a Python package designed to compare DataFrame objects, enabling you to quickly identify the differences between two datasets.
Installation
pip install pandascompare
Main Features
The PandasCompare class compares any two DataFrame objects along the following dimensions:
Rows ➔ discrepancies with respect to the join key(s) specified via the join_on argument.
Columns ➔ name differences or missing columns.
Values ➔ data that differs in terms of value or type.
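To make the three dimensions concrete, here is a minimal sketch in plain pandas. It is not how PandasCompare is implemented (the package internals are not shown here); it only illustrates the kind of differences each dimension refers to. The variable names and the small sample frames are assumptions made for this illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
left = pd.DataFrame({"id": [1, 2, 3], "amount": [10.5, 5.3, 33.77]})
right = pd.DataFrame({
    "id": [1, 2, 9],
    "amount": [11.1, 5.3, 14.0],
    "last_name": ["Jones", "Smith", "Einck"],
})

# Rows: an outer join on the key with indicator=True flags keys
# that exist on only one side.
keys = left[["id"]].merge(right[["id"]], on="id", how="outer", indicator=True)
left_only_ids = sorted(keys.loc[keys["_merge"] == "left_only", "id"])
right_only_ids = sorted(keys.loc[keys["_merge"] == "right_only", "id"])

# Columns: set differences reveal columns missing from either side.
left_only_cols = sorted(set(left.columns) - set(right.columns))
right_only_cols = sorted(set(right.columns) - set(left.columns))

# Values: for keys present on both sides, compare a shared column.
both = left.merge(right, on="id", suffixes=("_left", "_right"))
value_diffs = both.loc[
    both["amount_left"] != both["amount_right"],
    ["id", "amount_left", "amount_right"],
]

print("Rows only in left:", left_only_ids)
print("Rows only in right:", right_only_ids)
print("Columns only in left:", left_only_cols)
print("Columns only in right:", right_only_cols)
print(value_diffs)
```

A simple inequality test like the one above treats NaN as different from NaN and ignores dtype differences, so a real comparison needs extra handling for both cases.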
Example Usage
Please refer to the documentation within the code for more information.
Imports
from pandascompare import PandasCompare
Create DataFrames
First, let's create two sample DataFrame objects to compare.
import pandas as pd
import numpy as np

# February Data
left_df = pd.DataFrame({
    'id': [1, 2, 3],
    'date': [pd.to_datetime('2024-02-29')] * 3,
    'first_name': ['Alice', 'Mike', 'John'],
    'amount': [10.5, 5.3, 33.77],
})

# January Data
right_df = pd.DataFrame({
    'id': [1, 2, 9],
    'date': [pd.to_datetime('2024-01-31')] * 3,
    'first_name': ['Alice', 'Michael', 'Zachary'],
    'last_name': ['Jones', 'Smith', 'Einck'],
    'amount': [11.1, np.nan, 14],
})
Compare DataFrames
Next, we will initialize a PandasCompare instance to perform the comparison. Please consult the in-code documentation for a comprehensive list of available arguments.
pc = PandasCompare(
    left=left_df,
    right=right_df,
    left_label='feb',
    right_label='jan',
    join_on='id',
    left_ref=['first_name'],
    include_delta=True,
    verbose=True,
)
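The exact output of include_delta is not described here, so the following is only a sketch of one plausible reading: for rows matched on the join key, a delta is the left value minus the right value of a numeric column. It uses plain pandas rather than the PandasCompare API, and the column names amount_delta and first_name_changed are hypothetical.

```python
import numpy as np
import pandas as pd

# The same 'id', 'first_name' and 'amount' values as the sample data above.
feb = pd.DataFrame({
    "id": [1, 2, 3],
    "first_name": ["Alice", "Mike", "John"],
    "amount": [10.5, 5.3, 33.77],
})
jan = pd.DataFrame({
    "id": [1, 2, 9],
    "first_name": ["Alice", "Michael", "Zachary"],
    "amount": [11.1, np.nan, 14],
})

# Keep only ids present in both months; suffixes mirror the feb/jan labels.
matched = feb.merge(jan, on="id", suffixes=("_feb", "_jan"))

# Assumed meaning of a "delta": February amount minus January amount.
# A missing value on either side yields NaN.
matched["amount_delta"] = matched["amount_feb"] - matched["amount_jan"]

# Flag text values that differ between the two months.
matched["first_name_changed"] = (
    matched["first_name_feb"] != matched["first_name_jan"]
)

print(matched[["id", "amount_delta", "first_name_changed"]])
```

Only ids 1 and 2 appear in both months, so ids 3 and 9 drop out of this view and would be reported as row discrepancies instead.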
Export to Excel
Finally, let's export the compare report to an Excel file to view the results.
pc.export_to_excel()