A tool for comparing two Pandas DataFrame objects
dfcompy
Description
dfcompy is a Python package for comparing two pandas DataFrame objects. It identifies rows that were inserted, deleted, or updated between the two DataFrames, which is particularly useful in data analysis and data cleaning.
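To make the inserted and deleted categories concrete, here is a minimal sketch in plain pandas. It is not dfcompy's implementation, which the description does not show. It assumes rows are matched on a key column and uses `DataFrame.merge` with `indicator=True` to tell which side each key came from. The frames `old` and `new`, the `ID` key, and all values are invented for illustration.

```python
import pandas as pd

# Hypothetical data: ID is the key used to match rows between the two frames.
old = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Ann", "Bob", "Cy"]})
new = pd.DataFrame({"ID": [2, 3, 4], "Name": ["Bob", "Cyd", "Dee"]})

# Outer-merge on the key only; the _merge column records where each key was found.
keys = old[["ID"]].merge(new[["ID"]], on="ID", how="outer", indicator=True)

# Keys only in the old frame were deleted, keys only in the new frame were inserted.
deleted_ids = keys.loc[keys["_merge"] == "left_only", "ID"].tolist()
inserted_ids = keys.loc[keys["_merge"] == "right_only", "ID"].tolist()

# Keys present on both sides are the candidates for "updated" or "unchanged".
common_ids = sorted(keys.loc[keys["_merge"] == "both", "ID"].tolist())

print("deleted:", deleted_ids)
print("inserted:", inserted_ids)
print("in both:", common_ids)
```

Rows whose key appears in both frames still need a value-by-value comparison to decide whether they were updated.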
Installation
Install dfcompy using pip:
pip install dfcompy
Usage
import pandas as pd
from dfcompy import DataFrameComparator
# Create example DataFrames
# ... [example DataFrame creation]
# Create a DataFrameComparator instance
comparator = DataFrameComparator(df1, df2, on=['ID'], subset=['Name', 'Age'])
# Detect deleted rows
print("Deleted Rows:")
print(comparator.rows_deleted())
# Detect inserted rows
print("\nInserted Rows:")
print(comparator.rows_inserted())
# Detect updated rows
print("\nUpdated Rows:")
print(comparator.rows_before_update())
# Detect unchanged rows
print("\nUnchanged Rows:")
print(comparator.rows_in_common())
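The call above passes `on=['ID']` and `subset=['Name', 'Age']`, but the DataFrame creation is elided. One plausible reading, which this page does not confirm, is that `on` names the key columns used to pair rows and `subset` names the columns whose values are compared. The sketch below illustrates that reading in plain pandas without calling dfcompy. The frames, the extra `City` column, and every value are hypothetical.

```python
import pandas as pd

# Hypothetical frames; only the column names ID, Name and Age come from the usage above.
df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Name": ["Ann", "Bob", "Cy"],
    "Age": [30, 41, 25],
    "City": ["Oslo", "Rome", "Lima"],
})
df2 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Name": ["Ann", "Bob", "Cy"],
    "Age": [30, 42, 25],
    "City": ["Oslo", "Rome", "Kyiv"],
})

on = ["ID"]               # assumed meaning: key columns used to pair rows
subset = ["Name", "Age"]  # assumed meaning: columns checked for changes

# Pair rows by key, keeping only the key and the compared columns.
paired = df1[on + subset].merge(df2[on + subset], on=on, suffixes=("_before", "_after"))

# Flag a row as changed when any compared column differs between the two frames.
changed = pd.Series(False, index=paired.index)
for col in subset:
    changed |= paired[f"{col}_before"] != paired[f"{col}_after"]

# Rows of df1 as they looked before the update, and rows with no change in subset.
updated_before = df1[df1["ID"].isin(paired.loc[changed, "ID"])]
unchanged = df1[df1["ID"].isin(paired.loc[~changed, "ID"])]

print(updated_before)
print(unchanged)
```

Under this reading, the row with ID 3 counts as unchanged even though `City` differs, because `City` is not in `subset`.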
Contributing
Contributions are welcome! For major changes, please open an issue first to discuss what you would like to change.
License
Source Distribution
dfcompy-1.0.0.tar.gz (5.3 kB)
File details
Details for the file dfcompy-1.0.0.tar.gz.
File metadata
- Download URL: dfcompy-1.0.0.tar.gz
- Upload date:
- Size: 5.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.12.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 27e96ba49ffe4c209af1c64cd95388dc4ec01f025b499d18542602cb5f3efe6a
MD5 | a84b1337adf284effb4694178789a1a7
BLAKE2b-256 | 3748d5c881f7b49bfa7c496c3546206a8bf3a8c962bc3e385d256ae6c747bb5c