Package to find differences between two CSV files
🔎 CSV-Compare
A Python package built with Jamf MacAdmins in mind
The csvcomparetool package allows you to find differences between two CSV files based on a specified column identifier. It provides a simple way to compare the contents of two CSV files and identify which records are present in one file but not in the other.
Best Practice
When comparing CSV files for differences, pass the CSV with more entries as the second path. For example, if CSV-1 lists 34 names and CSV-2 lists 40, CSV-2 should be the second CSV path passed so that the differences show as expected.
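The effect of argument order can be illustrated with a small standard-library sketch. It does not use csvcomparetool and is not the package's implementation. The helper `values_missing_from_first` and the sample data are hypothetical, and the sketch assumes the comparison is one-directional (values in the second file that are absent from the first), as the usage messages in this README suggest.

```python
import csv
import io


def values_missing_from_first(first_csv, second_csv, column):
    """Return values of `column` found in second_csv but not in first_csv.

    Hypothetical helper for illustration only. Both arguments are
    file-like objects containing CSV text with a header row.
    """
    first_values = {row[column] for row in csv.DictReader(first_csv)}
    missing = []
    for row in csv.DictReader(second_csv):
        value = row[column]
        if value not in first_values and value not in missing:
            missing.append(value)  # keep first-seen order, skip duplicates
    return missing


# Made-up sample data: the second list is a superset of the first.
short_list = "Full Name\nAda Lovelace\nAlan Turing\n"
long_list = "Full Name\nAda Lovelace\nAlan Turing\nGrace Hopper\n"

# Larger file passed second: the extra record is reported.
found = values_missing_from_first(
    io.StringIO(short_list), io.StringIO(long_list), "Full Name"
)
print(found)

# Larger file passed first: nothing is reported.
swapped = values_missing_from_first(
    io.StringIO(long_list), io.StringIO(short_list), "Full Name"
)
print(swapped)
```

Under that one-directional assumption, passing the larger file first hides the extra records, which is why the larger file should go second.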
Installation
You can install CSV-Compare using pip:
pip install csvcomparetool
Usage
To use CSV-Compare in your Python project, follow these steps.
Import the CSVComparer class from the package
from csvcomparetool import CSVComparer
Create a CSVComparer object
Note
The CSVComparer object requires two paths and a column identifier. The column identifier is the title of the column you are pulling from.
For example, 'Full Name' would be the column identifier for full names.
csv1_path = "path/to/first.csv"
csv2_path = "path/to/second.csv"
column = "identifier_column"
comparer = CSVComparer(csv1_path, csv2_path, column)
Validate the paths to ensure the specified CSV files exist
if not comparer.validate_paths():
    print("CSV file paths are invalid. Please check the file paths and try again.")
    return  # assumes this snippet runs inside a function
Validate the column exists in the provided CSV files
if not comparer.validate_columns():
    print("Provided column not found in CSV. Check the columns and try again.")
    return  # assumes this snippet runs inside a function
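Checks like the two above might look roughly like the following standard-library sketch. This is an assumption about what such validation involves, not the code behind `validate_paths()` or `validate_columns()`. The helpers `paths_exist` and `column_in_all` and the sample file contents are hypothetical.

```python
import csv
import os
import tempfile


def paths_exist(*paths):
    """Return True only if every path points to an existing file."""
    return all(os.path.isfile(path) for path in paths)


def column_in_all(column, *paths):
    """Return True only if `column` is a header in every CSV file."""
    for path in paths:
        with open(path, newline="") as handle:
            # fieldnames reads the header row; it is None for an empty file.
            header = csv.DictReader(handle).fieldnames or []
        if column not in header:
            return False
    return True


# Demonstration with throwaway files (made-up data).
with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "first.csv")
    second = os.path.join(tmp, "second.csv")
    for path in (first, second):
        with open(path, "w", newline="") as handle:
            csv.writer(handle).writerows([["Full Name"], ["Ada Lovelace"]])

    both_exist = paths_exist(first, second)
    one_missing = paths_exist(first, os.path.join(tmp, "absent.csv"))
    has_name = column_in_all("Full Name", first, second)
    has_email = column_in_all("Email", first, second)

print(both_exist, one_missing, has_name, has_email)
```

Checking the header with an exact string match means the column identifier has to match the column title character for character, including case and spacing.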
Find the differences between the two CSV files
differences = comparer.find_differences()
Print or process the differences as needed
for difference in differences:
    print(f"Record '{difference}' is present in CSV2 but not in CSV1.")
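As one way of processing the differences rather than printing them, the sketch below writes them to a single-column CSV. It assumes `find_differences()` yields plain string values, which the loop above suggests but this README does not state. The helper `write_differences` and the sample values are hypothetical.

```python
import csv
import io


def write_differences(differences, column, handle):
    """Write each difference as one row under a `column` header."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow([column])
    for value in differences:
        writer.writerow([value])


# Stand-in for the result of comparer.find_differences() (made-up values).
differences = ["Grace Hopper", "Katherine Johnson"]

# An in-memory buffer keeps the example self-contained; a real file opened
# with open("differences.csv", "w", newline="") could be passed instead.
buffer = io.StringIO()
write_differences(differences, "Full Name", buffer)
report = buffer.getvalue()
print(report)
```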
Examples
Python Project
from csvcomparetool import CSVComparer
csv1_path = "path/to/first.csv"
csv2_path = "path/to/second.csv"
column = "identifier_column"
comparer = CSVComparer(csv1_path, csv2_path, column)
if not comparer.validate_paths() or not comparer.validate_columns():
    print("CSV file paths are invalid, or the column identifier does not exist. Check the file paths and columns and try again.")
else:
    differences = comparer.find_differences()
    for difference in differences:
        print(f"Record '{difference}' is present in CSV2 but not in CSV1.")
Command Line tool
First, clone this repository to a local directory on your machine
git clone https://github.com/liquidz00/csv-compare.git
Navigate to the cloned repo location (local directory where you chose to save the repository)
cd /path/to/repo/src/csvcomparetool/
Lastly, run the following command
python cli.py /csv/path/one /csv/path/two columnidentifier
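The command above passes three positional arguments: two CSV paths and the column identifier. The repository's cli.py is not reproduced here. The following is only a sketch of how such arguments could be read with `argparse`, and the function `build_parser` and the argument names are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for two CSV paths and a column identifier."""
    parser = argparse.ArgumentParser(
        prog="cli.py",
        description="Find values present in the second CSV but not the first.",
    )
    parser.add_argument("csv1_path", help="path to the first CSV file")
    parser.add_argument("csv2_path", help="path to the second (larger) CSV file")
    parser.add_argument("column", help="title of the column to compare")
    return parser


# Parse an explicit argument list so the example runs without a shell.
args = build_parser().parse_args(["/csv/path/one", "/csv/path/two", "Full Name"])
print(args.csv1_path, args.csv2_path, args.column)
```

A column title that contains spaces, such as 'Full Name', has to be quoted on the command line so the shell passes it as a single argument.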
Hashes for csv_compare_tool-1.0.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 0042bf02060c006a902b8c39c1ab836cd27b2f2f7c0d90144e40f5a0a02fbdf0
MD5 | e05943ade3a294bd25da57402041a933
BLAKE2b-256 | c3665d20ff8bcac7d2aeb40a6a1cd370e9cbb40ffb1ddf8f473eece1718fa642
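A downloaded wheel can be checked against the SHA256 digest listed above with a short `hashlib` sketch. The helpers `sha256_of_file` and `matches_expected` are hypothetical, and the demonstration hashes a throwaway file because the wheel itself is not downloaded here.

```python
import hashlib
import os
import tempfile

# SHA256 digest listed above for csv_compare_tool-1.0.3-py3-none-any.whl.
EXPECTED_SHA256 = "0042bf02060c006a902b8c39c1ab836cd27b2f2f7c0d90144e40f5a0a02fbdf0"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals `expected`."""
    return sha256_of_file(path) == expected.lower()


# Demonstration on a throwaway file that is not the real wheel.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")
    payload = b"not the real wheel"
    with open(sample, "wb") as handle:
        handle.write(payload)
    sample_digest = sha256_of_file(sample)
    sample_matches = matches_expected(sample)

print(sample_digest, sample_matches)
```

To check a real download, call `matches_expected` with the path to the saved wheel file.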