A simple Python library for merging DataFrames while accommodating variations in language, spelling, and typos.

EasyMerge: DataFrame Merger

Overview

The library's merge function iteratively merges multiple DataFrames on a specified column. It is particularly useful when the key values in different datasets vary in language, case, or spelling, or contain typos.
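To make the idea of iterative merging on a tolerant key concrete, here is a minimal sketch that uses only pandas and the standard library. It is not easymerge's implementation: the helper names `normalize_key` and `merge_on_normalized` are hypothetical, and the matching shown here handles only case, accents, and surrounding whitespace.

```python
import unicodedata
from functools import reduce

import pandas as pd


def normalize_key(value):
    """Return the key without accents, surrounding whitespace, or case."""
    decomposed = unicodedata.normalize("NFKD", str(value))
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.strip().casefold()


def merge_on_normalized(frames, column_name):
    """Left-merge each frame onto the first one via a normalized helper key."""

    def with_key(df):
        # Add a temporary column holding the normalized key.
        return df.assign(_key=df[column_name].map(normalize_key))

    def step(left, right):
        # Keep the spelling of the first frame; join on the helper key only.
        right = with_key(right).drop(columns=[column_name])
        return left.merge(right, on="_key", how="left")

    merged = reduce(step, frames[1:], with_key(frames[0]))
    return merged.drop(columns=["_key"])


left = pd.DataFrame({"Country": ["India", "FrAnce"],
                     "Capital": ["New Delhi", "Paris"]})
right = pd.DataFrame({"Country": ["Indía", "france"],
                      "Continent": ["Asia", "Europe"]})

result = merge_on_normalized([left, right], "Country")
print(result)
```

A normalized key of this kind cannot pair aliases such as "USA" and "United States"; matching those needs additional logic that this sketch does not attempt.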

Installation

You can install the library using pip:

pip install easymerge

Usage

Consider the following example where we have multiple DataFrames containing information about countries:

DataFrame 1:

Country   Capital
USA       Washington DC
China     Beijing
India     New Delhi
FrAnce    Paris
Canada    Ottawa

DataFrame 2:

Country         Population (in Millions)
canada          38.25
United States   331.90
China           1410.78
Indía           1417.00

DataFrame 3:

Country   Continent
UsA       North America
ChinA     Asia
indIa     Asia
france    Europe
CaNada    North America

We can merge these DataFrames using the merge function:

from easymerge import merger
import pandas as pd

# Define the DataFrames
df1 = pd.DataFrame({
    'Country': ['USA', 'China', 'India', 'FrAnce', 'Canada'],
    'Capital': ['Washington DC', 'Beijing', 'New Delhi', 'Paris', 'Ottawa']
})

df2 = pd.DataFrame({
    'Country': ['canada', 'United States', 'China', 'Indía'],
    'Population (in Millions)': [38.25, 331.9, 1410.78, 1417]
})

df3 = pd.DataFrame({
    'Country': ['UsA', 'ChinA', 'indIa', 'france', 'CaNada'],
    'Continent': ['North America', 'Asia', 'Asia', 'Europe', 'North America']
})

# Perform iterative merging
merged_df = merger.merge(df1, df2, df3, column_name='Country')

print(merged_df)

This code will merge the provided DataFrames based on the "Country" column, accounting for variations in case and spelling. The resulting DataFrame will contain all the information merged based on the specified column.

Merged DataFrame:

Country   Continent       Population (in Millions)   Capital
USA       North America   331.90                     Washington DC
China     Asia            1410.78                    Beijing
India     Asia            1417.00                    New Delhi
FrAnce    Europe          NaN                        Paris
Canada    North America   38.25                      Ottawa
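
The NaN in the France row reflects that DataFrame 2 has no entry for France. As a minimal sketch in plain pandas, the rows that still have gaps after a merge can be listed as follows. The `merged_df` here is copied by hand from the merged table for illustration and is not output captured from easymerge.

```python
import pandas as pd

# Hand-copied from the merged table above (illustrative, not library output).
merged_df = pd.DataFrame({
    "Country": ["USA", "China", "India", "FrAnce", "Canada"],
    "Continent": ["North America", "Asia", "Asia", "Europe", "North America"],
    "Population (in Millions)": [331.90, 1410.78, 1417.00, float("nan"), 38.25],
    "Capital": ["Washington DC", "Beijing", "New Delhi", "Paris", "Ottawa"],
})

# Rows in which at least one column is missing.
incomplete = merged_df[merged_df.isna().any(axis=1)]

# Number of missing values per column.
missing_per_column = merged_df.isna().sum()

print(incomplete["Country"].tolist())
print(missing_per_column)
```

A check like this helps distinguish keys that failed to match from keys that are simply absent from one of the inputs.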
