A tool for comparing two Pandas DataFrame objects

Project description

dfcompy

Description

dfcompy is a Python package for comparing two pandas DataFrame objects. It identifies the rows that were inserted, deleted, or updated between the two DataFrames, which is useful in data analysis and data cleaning work.
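To make the inserted / deleted / updated distinction concrete, here is a minimal sketch in plain pandas. It does not use dfcompy and is not taken from its source: the helper `classify_rows`, its parameters, and the sample data are all hypothetical, and the package may classify rows differently (for example in how it treats missing values).

```python
import pandas as pd


def classify_rows(old, new, key, cols):
    """Hypothetical helper: classify rows by key as deleted, inserted,
    updated, or unchanged. Assumes every column in `cols` exists in both frames."""
    # An outer merge with indicator=True records where each key was found.
    merged = old.merge(
        new, on=key, how="outer", suffixes=("_old", "_new"), indicator=True
    )
    deleted = merged[merged["_merge"] == "left_only"]    # only in `old`
    inserted = merged[merged["_merge"] == "right_only"]  # only in `new`
    both = merged[merged["_merge"] == "both"]            # key in both frames

    # A row counts as updated if any compared column differs.
    # Caveat: NaN != NaN, so missing values on both sides count as a change here.
    changed = pd.Series(False, index=both.index)
    for col in cols:
        changed |= both[f"{col}_old"] != both[f"{col}_new"]

    return {
        "deleted": deleted[key],
        "inserted": inserted[key],
        "updated": both.loc[changed, key],
        "unchanged": both.loc[~changed, key],
    }


# Invented sample data for illustration only.
old = pd.DataFrame({"ID": [1, 2, 3], "Name": ["Ann", "Bob", "Cy"], "Age": [30, 41, 25]})
new = pd.DataFrame({"ID": [2, 3, 4], "Name": ["Bob", "Cy", "Dee"], "Age": [42, 25, 37]})

result = classify_rows(old, new, key=["ID"], cols=["Name", "Age"])
for label, frame in result.items():
    print(label, frame["ID"].tolist())
```

In this sample, ID 1 exists only in the first frame, ID 4 only in the second, ID 2 has a changed Age, and ID 3 is identical in both.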

Installation

Install dfcompy using pip:

pip install dfcompy

Usage

import pandas as pd
from dfcompy import DataFrameComparator

# Create example DataFrames
# ... [example DataFrame creation]

# Create a DataFrameComparator instance
comparator = DataFrameComparator(df1, df2, on=['ID'], subset=['Name', 'Age'])

# Detect deleted rows
print("Deleted Rows:")
print(comparator.rows_deleted())

# Detect inserted rows
print("\nInserted Rows:")
print(comparator.rows_inserted())

# Detect updated rows (going by the method name, shown as they were before the update)
print("\nUpdated Rows:")
print(comparator.rows_before_update())

# Detect unchanged rows
print("\nUnchanged Rows:")
print(comparator.rows_in_common())
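The usage snippet leaves out the creation of `df1` and `df2`. One pair of frames that fits the `on=['ID']` and `subset=['Name', 'Age']` arguments might look like the sketch below. The data is invented, and the reading of `df1` as the earlier version and `df2` as the later one is an assumption, not something the project description states. The last lines use plain pandas, not dfcompy, to show which IDs appear in only one frame.

```python
import pandas as pd

# Hypothetical sample data; column names follow the on/subset arguments above.
df1 = pd.DataFrame(
    {"ID": [1, 2, 3], "Name": ["Ann", "Bob", "Cy"], "Age": [30, 41, 25]}
)
df2 = pd.DataFrame(
    {"ID": [2, 3, 4], "Name": ["Bob", "Cy", "Dee"], "Age": [42, 25, 37]}
)

# IDs present in only one frame, computed with plain pandas for orientation.
only_in_df1 = df1.loc[~df1["ID"].isin(df2["ID"]), "ID"].tolist()
only_in_df2 = df2.loc[~df2["ID"].isin(df1["ID"]), "ID"].tolist()
print(only_in_df1, only_in_df2)
```

With frames like these, a comparison keyed on ID has one candidate for each category: ID 1 is missing from `df2`, ID 4 is new in `df2`, ID 2 differs in Age, and ID 3 is the same in both.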

Contributing

Contributions are welcome! For major changes, please open an issue first to discuss what you would like to change.

License

MIT

Download files

Source Distribution

dfcompy-1.0.0.tar.gz (5.3 kB)

Uploaded Source

File details

Details for the file dfcompy-1.0.0.tar.gz.

File metadata

  • Download URL: dfcompy-1.0.0.tar.gz
  • Upload date:
  • Size: 5.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.12.0

File hashes

Hashes for dfcompy-1.0.0.tar.gz

Algorithm      Hash digest
SHA256         27e96ba49ffe4c209af1c64cd95388dc4ec01f025b499d18542602cb5f3efe6a
MD5            a84b1337adf284effb4694178789a1a7
BLAKE2b-256    3748d5c881f7b49bfa7c496c3546206a8bf3a8c962bc3e385d256ae6c747bb5c
