Compare two dataframes and return the column-wise differences and the additional records

Project description


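dataframe_diff is a micro-library that takes two dataframes as input, compares them on a set of key columns, and returns two dataframes: one with the column-wise differences for rows present in both inputs, and one with the additional records, i.e. rows whose key appears in only one of the inputs.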


pip install dataframe-diff


>>> import pandas as pd
>>> from dataframe_diff import dataframe_diff
>>> df1 = pd.read_csv('students_1.csv')
>>> df2 = pd.read_csv('students_2.csv')
>>> df1.head()
      Name Subjects  Marks Grade
0  Leonard      Eng     70     B
1  Leonard     Math     80     B
2  Leonard  Physics     90     A
3  Sheldon      Eng     90     A
4  Sheldon     Math     99     A
>>> df2.head()
      Name Subjects  Marks Grade
0  Leonard      Eng     75     A
1  Leonard     Math     85     A
2  Leonard  Physics     90     A
3  Sheldon      Eng     99     A
4  Sheldon     Math     99     A
>>> # head() shows only the first five rows; the comparison runs on the full dataframes
>>> d1_column, d2_additional = dataframe_diff(df1, df2, key=['Name', 'Subjects'])
>>> d1_column
      Name Subjects value_x value_y column_name
0  Leonard      Eng      70      75       Marks
1  Leonard      Eng       B       A       Grade
2  Leonard     Math      80      85       Marks
3  Leonard     Math       B       A       Grade
4  Sheldon      Eng      90      99       Marks
5    Penny  Physics      65      75       Marks
6    Penny  Physics       C       B       Grade
>>> d2_additional
     Name   Subjects  Marks Grade  sets
0  Rajesh       Math     93     A  df_x
1  Howard  Chemistry     83     B  df_y
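
The same kind of result can be approximated with plain pandas, which may help when reading the two outputs above. The sketch below is not the dataframe_diff source code: `sketch_diff` is a hypothetical helper, it assumes both frames share the same columns and that the key columns uniquely identify a row, and the sample rows are a small hand-built subset modelled on the session above rather than the actual CSV files. It also assumes the `df_x` / `df_y` labels in `sets` refer to the first and second input respectively.

```python
import pandas as pd


def sketch_diff(df_x, df_y, key):
    """Hypothetical helper; not the dataframe_diff implementation.

    Assumes both frames share the same columns and that `key`
    uniquely identifies a row in each frame.
    """
    value_cols = [c for c in df_x.columns if c not in key]

    # Rows whose key exists in both frames; shared value columns get _x/_y suffixes.
    common = df_x.merge(df_y, on=key, how="inner", suffixes=("_x", "_y"))

    pieces = []
    for col in value_cols:
        x, y = common[f"{col}_x"], common[f"{col}_y"]
        # Treat two missing values as equal rather than as a change.
        changed = (x != y) & ~(x.isna() & y.isna())
        piece = common.loc[changed, key].copy()
        piece["value_x"] = x[changed]
        piece["value_y"] = y[changed]
        piece["column_name"] = col
        pieces.append(piece)
    # A stable sort keeps the value-column order within each key.
    column_diff = pd.concat(pieces).sort_index(kind="mergesort").reset_index(drop=True)

    # Rows whose key appears in only one frame, labelled by origin.
    keys_x = pd.MultiIndex.from_frame(df_x[key])
    keys_y = pd.MultiIndex.from_frame(df_y[key])
    only_x = df_x[~keys_x.isin(keys_y)].assign(sets="df_x")
    only_y = df_y[~keys_y.isin(keys_x)].assign(sets="df_y")
    additional = pd.concat([only_x, only_y], ignore_index=True)
    return column_diff, additional


# Illustrative data only, modelled on the rows shown in the session above.
cols = ["Name", "Subjects", "Marks", "Grade"]
df1 = pd.DataFrame(
    [["Leonard", "Eng", 70, "B"], ["Sheldon", "Math", 99, "A"], ["Rajesh", "Math", 93, "A"]],
    columns=cols,
)
df2 = pd.DataFrame(
    [["Leonard", "Eng", 75, "A"], ["Sheldon", "Math", 99, "A"], ["Howard", "Chemistry", 83, "B"]],
    columns=cols,
)

column_diff, additional = sketch_diff(df1, df2, key=["Name", "Subjects"])
print(column_diff)
print(additional)
```

With this data, `column_diff` holds one row per changed column (Marks and Grade for Leonard's Eng record), and `additional` holds the Rajesh and Howard rows tagged with the frame they came from. Using an inner merge for the shared keys keeps integer columns from being upcast to float, which an outer merge would do once missing rows are introduced.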

Download files

Source distribution: dataframe_diff-0.5.tar.gz (2.8 kB)
