Compare two dataframes and return column-wise differences and additional records
Project description
dataframe_diff
dataframe_diff is a micro-library that takes two dataframes as input, compares them on a set of key columns, and returns two dataframes: one with the column-wise differences and one with the additional records found in only one of the inputs.
Installation
pip install dataframe-diff
Examples
>>> import pandas as pd
>>> from dataframe_diff import dataframe_diff
>>> df1 = pd.read_csv('students_1.csv')
>>> df2 = pd.read_csv('students_2.csv')
>>> df1.head()
      Name Subjects  Marks Grade
0  Leonard      Eng     70     B
1  Leonard     Math     80     B
2  Leonard  Physics     90     A
3  Sheldon      Eng     90     A
4  Sheldon     Math     99     A
>>> df2.head()
      Name Subjects  Marks Grade
0  Leonard      Eng     75     A
1  Leonard     Math     85     A
2  Leonard  Physics     90     A
3  Sheldon      Eng     99     A
4  Sheldon     Math     99     A
>>> d1_column, d2_additional = dataframe_diff(df1, df2, key=['Name', 'Subjects'])
>>> d1_column
      Name Subjects value_x value_y column_name
0  Leonard      Eng      70      75       Marks
1  Leonard      Eng       B       A       Grade
2  Leonard     Math      80      85       Marks
3  Leonard     Math       B       A       Grade
4  Sheldon      Eng      90      99       Marks
5    Penny  Physics      65      75       Marks
6    Penny  Physics       C       B       Grade
>>> d2_additional
     Name   Subjects  Marks Grade  sets
0  Rajesh       Math     93     A  df_x
1  Howard  Chemistry     83     B  df_y
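For readers who want to see what this kind of comparison involves, here is a minimal sketch that produces the same two tables using plain pandas. It is not the library's implementation: `sketch_diff` is a hypothetical helper, the sample rows beyond those shown by `head()` are inferred from the output above, and the sketch assumes both frames share the same columns, have unique keys, and contain no missing values.

```python
import pandas as pd


def sketch_diff(df_x, df_y, key):
    """Hypothetical helper (not part of dataframe_diff): compare two frames on key columns."""
    value_cols = [c for c in df_x.columns if c not in key]

    # Rows whose key exists in both frames; shared non-key columns get _x/_y suffixes.
    both = df_x.merge(df_y, on=key, how="inner", suffixes=("_x", "_y"))
    parts = []
    for col in value_cols:
        changed = both[both[f"{col}_x"] != both[f"{col}_y"]]
        part = changed[key].copy()
        part["value_x"] = changed[f"{col}_x"].astype(object)
        part["value_y"] = changed[f"{col}_y"].astype(object)
        part["column_name"] = col
        parts.append(part)
    # A stable sort on the row index keeps all differences for one key together.
    column_diff = pd.concat(parts).sort_index(kind="mergesort").reset_index(drop=True)

    # Keys present in only one frame, found with an indicator merge on the key columns.
    keys = df_x[key].merge(df_y[key], on=key, how="outer", indicator=True)
    only_x = df_x.merge(keys.loc[keys["_merge"] == "left_only", key], on=key)
    only_y = df_y.merge(keys.loc[keys["_merge"] == "right_only", key], on=key)
    additional = pd.concat(
        [only_x.assign(sets="df_x"), only_y.assign(sets="df_y")], ignore_index=True
    )
    return column_diff, additional


# Illustrative in-memory data standing in for students_1.csv and students_2.csv.
columns = ["Name", "Subjects", "Marks", "Grade"]
df1 = pd.DataFrame(
    [
        ["Leonard", "Eng", 70, "B"], ["Leonard", "Math", 80, "B"],
        ["Leonard", "Physics", 90, "A"], ["Sheldon", "Eng", 90, "A"],
        ["Sheldon", "Math", 99, "A"], ["Penny", "Physics", 65, "C"],
        ["Rajesh", "Math", 93, "A"],
    ],
    columns=columns,
)
df2 = pd.DataFrame(
    [
        ["Leonard", "Eng", 75, "A"], ["Leonard", "Math", 85, "A"],
        ["Leonard", "Physics", 90, "A"], ["Sheldon", "Eng", 99, "A"],
        ["Sheldon", "Math", 99, "A"], ["Penny", "Physics", 75, "B"],
        ["Howard", "Chemistry", 83, "B"],
    ],
    columns=columns,
)

column_diff, additional = sketch_diff(df1, df2, key=["Name", "Subjects"])
print(column_diff)
print(additional)
```

The sketch uses an inner merge for the column-wise comparison so that integer columns keep their dtype, and a separate indicator merge on the key columns to find records that exist on one side only.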
Download files
Source Distribution
dataframe_diff-0.5.tar.gz (2.8 kB)
File details
Details for the file dataframe_diff-0.5.tar.gz.
File metadata
- Download URL: dataframe_diff-0.5.tar.gz
- Upload date:
- Size: 2.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.1.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | f9c069138a0337d2e16a1c00a5c6f18d1161182d08977e8a07622cf1af685259
MD5 | d5a51a37e2ea94db625d477958f23c3a
BLAKE2b-256 | 059e5c8439ec8aa92ff591a9af5837396eca290529d3a224c6ca9f9437e41ffc