
Package to find differences between two CSV files


🔎 CSV-Compare


A Python package built with Jamf MacAdmins in mind

The csvcomparetool package finds differences between two CSV files based on a column you specify as the identifier. It compares the values in that column and identifies which records are present in one file but not in the other.

Best Practice
When comparing CSV files, pass the CSV with more entries as the second path. For example, if CSV-1 lists 34 names and CSV-2 lists 40, pass CSV-2 second so that the differences show as expected. The usage examples below report records that are present in the second file but not in the first.
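
To see why the order matters, here is a minimal sketch that uses only the standard library. It assumes the comparison reports values found in the second file's column but not in the first, which matches the messages printed in the usage examples. It is not the package's actual implementation, and `column_values` and `one_way_difference` are hypothetical helpers written for illustration.

```python
import csv
import io


def column_values(csv_text, column):
    """Return the set of values found in `column` of a CSV given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column] for row in reader}


def one_way_difference(first_text, second_text, column):
    """Values present in the second CSV but missing from the first (assumed direction)."""
    missing = column_values(second_text, column) - column_values(first_text, column)
    return sorted(missing)


# Hypothetical sample data: the second file has one extra name.
smaller = "Full Name\nAda Lovelace\nGrace Hopper\n"
larger = "Full Name\nAda Lovelace\nGrace Hopper\nAlan Turing\n"

print(one_way_difference(smaller, larger, "Full Name"))  # larger file passed second
print(one_way_difference(larger, smaller, "Full Name"))  # larger file passed first
```

Under that assumption, the first call prints `['Alan Turing']` and the second prints an empty list, because a one-way difference only sees records that the second file adds.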

Installation

You can install CSV-Compare using pip:

pip install csvcomparetool

Usage

To use CSV-Compare in your Python project, follow these steps.

Import the CSVComparer class from the package

from csvcomparetool import CSVComparer

Create a CSVComparer object

Note
The CSVComparer object requires two file paths and a column identifier. The column identifier is the header of the column whose values are compared.
For example, 'Full Name' would be the column identifier for a column of full names.

csv1_path = "path/to/first.csv"
csv2_path = "path/to/second.csv"
column = "identifier_column"
comparer = CSVComparer(csv1_path, csv2_path, column)

Validate the paths to ensure the specified CSV files exist

if not comparer.validate_paths():
    # Exit with an error message; a bare `return` only works inside a function.
    raise SystemExit("CSV file paths are invalid. Please check the file paths and try again.")

Validate the column exists in the provided CSV files

if not comparer.validate_columns():
    # Exit with an error message; a bare `return` only works inside a function.
    raise SystemExit("Provided column not found in CSV. Check the columns and try again.")
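
If you want to inspect a file's headers yourself before choosing a column identifier, a check along these lines is one option. This is a standard-library sketch, not the package's `validate_columns` implementation; `has_column` is a hypothetical helper, and the sample data is made up.

```python
import csv
import io


def has_column(csv_file, column):
    """Return True if `column` appears in the header row of an open CSV file."""
    reader = csv.DictReader(csv_file)
    # `fieldnames` is None for an empty file, so fall back to an empty list.
    return column in (reader.fieldnames or [])


# Stand-in for an open file; with a real file use open(path, newline="").
sample = io.StringIO("Full Name,Serial Number\nAda Lovelace,C02ABC\n")
print(has_column(sample, "Full Name"))

sample.seek(0)  # rewind before reading the header again
print(has_column(sample, "Email"))
```

The comparison in this sketch is exact, so 'full name' or 'Full Name ' with a trailing space would not match 'Full Name'.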

Find the differences between the two CSV files

differences = comparer.find_differences()

Print or process the differences as needed

for difference in differences:
    print(f"Record '{difference}' is present in CSV2 but not in CSV1.")
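
As one way to process the differences rather than print them, the sketch below writes them to a small CSV report. It assumes `find_differences()` returns an iterable of strings, as the print loop above suggests; `write_report` and the "status" column are hypothetical and not part of the package.

```python
import csv
import io


def write_report(differences, out_file, column="identifier_column"):
    """Write one row per missing record to an open text file and return the row count."""
    writer = csv.writer(out_file)
    writer.writerow([column, "status"])
    count = 0
    for value in differences:
        writer.writerow([value, "missing from CSV1"])
        count += 1
    return count


# Stand-in for the result of comparer.find_differences().
differences = ["Alan Turing", "Katherine Johnson"]

buffer = io.StringIO()
rows = write_report(differences, buffer, column="Full Name")
print(f"Wrote {rows} rows")
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with `open("report.csv", "w", newline="")`, as the `csv` module documentation recommends.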

Examples

Python Project

from csvcomparetool import CSVComparer

csv1_path = "path/to/first.csv"
csv2_path = "path/to/second.csv"
column = "identifier_column"

comparer = CSVComparer(csv1_path, csv2_path, column)

if not comparer.validate_paths() or not comparer.validate_columns():
    print("CSV file paths are invalid, or the column identifier does not exist. Check the file paths and columns and try again.")
else:
    differences = comparer.find_differences()
    for difference in differences:
        print(f"Record '{difference}' is present in CSV2 but not in CSV1.")

Command Line tool

First, clone this repository to a local directory on your machine

git clone https://github.com/liquidz00/csv-compare.git

Navigate to the package directory inside the cloned repository (replace /path/to/repo with the local directory where you saved it)

cd /path/to/repo/src/csvcomparetool/

Finally, run the following command, passing the two CSV paths and the column identifier

python cli.py /csv/path/one /csv/path/two columnidentifier

