
Compare delimited files that share a common key.


Table of Contents
  1. Overview
  2. Example
  3. Installation
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments


Overview

csvcomparer is an open-source Python project used for determining differences between two delimited files (referred to here as "left" and "right" files) that share a common key, or index. Specifically, csvcomparer determines:

  • Columns exclusive to the left and right files, respectively.
  • Rows exclusive to the left and right files, respectively.
  • Field-level differences for rows/columns in common between files.
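The three kinds of difference listed above can be illustrated with a minimal standard-library sketch. This is not csvcomparer's implementation: the function names `load_rows` and `compare` are hypothetical, every value is kept as a string, and the input is assumed to be comma-delimited with unique keys.

```python
import csv
import io


def load_rows(text, key):
    """Index the rows of comma-delimited text by the value in the `key` column."""
    reader = csv.DictReader(io.StringIO(text))
    rows = {}
    for row in reader:
        rows[row[key]] = {c: v for c, v in row.items() if c != key}
    columns = [c for c in (reader.fieldnames or []) if c != key]
    return columns, rows


def compare(left_text, right_text, key):
    """Hypothetical helper: column, row, and field-level differences."""
    left_cols, left = load_rows(left_text, key)
    right_cols, right = load_rows(right_text, key)
    shared_cols = [c for c in left_cols if c in right_cols]

    changed = {}
    for k in left:
        if k not in right:
            continue
        # Only columns present in both files are compared field by field.
        deltas = [
            (c, left[k][c], right[k][c])
            for c in shared_cols
            if left[k][c] != right[k][c]
        ]
        if deltas:
            changed[k] = deltas

    return {
        "cols_added": [c for c in right_cols if c not in left_cols],
        "cols_removed": [c for c in left_cols if c not in right_cols],
        "rows_added": {k: right[k] for k in right if k not in left},
        "rows_removed": {k: left[k] for k in left if k not in right},
        "rows_changed": changed,
    }


left = "id,name,price\n1A,beer,$6.00\n3A,bacon,$3.33\n"
right = "id,name,price,stars\n1A,beer,$5.25,3.9\n4C,taco,$8.33,3.1\n"
result = compare(left, right, "id")
```

Here `result["rows_changed"]` holds one entry for key `1A`, because only its `price` differs in a column that both inputs share.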


Basic Usage

As a Python module:

from csvcomparer import CsvCompare

diffs = CsvCompare("left_csv_filepath", "right_csv_filepath", "key").diffs

As a command line utility:

> python csvcomparer left_csv_filepath right_csv_filepath key


Example

Provided the following file data:

menu_l.csv:

id  name    pic  price  score  togo
1A  beer    🍺   $6.00  3.9    N
1B  wine    🍷   $7.25  4.5    N
2A  cheese  🧀   $4.10  4.0    Y
3A  bacon   🥓   $3.33  4.9    Y

menu_r.csv:

id  name    pic  price  stars
1A  beer    🍻   $5.25  3.9
1B  wine    🍷   $7.25  4.8
2A  cheese  🧀   $3.95  4.1
4C  taco    🌮   $8.33  3.1
5B  pizza   🍕   $9.99  2.4

Usage as a Python module...

>>> from csvcomparer import CsvCompare
>>> CsvCompare("menu_l.csv", "menu_r.csv", "id").diffs

... or as a command-line utility:

> python csvcomparer menu_l.csv menu_r.csv id


{'cols_added': ['stars'],
 'cols_removed': ['score', 'togo'],
 'rows_added': {'4C': {'name': 'taco',
                       'pic': '🌮',
                       'price': '$8.33',
                       'stars': 3.1},
                '5B': {'name': 'pizza',
                       'pic': '🍕',
                       'price': '$9.99',
                       'stars': 2.4}},
 'rows_changed': {'1A': [('pic', '🍺', '🍻'), ('price', '$6.00', '$5.25')],
                  '2A': [('price', '$4.10', '$3.95')]},
 'rows_removed': {'3A': {'name': 'bacon',
                         'pic': '🥓',
                         'price': '$3.33',
                         'score': 4.9,
                         'togo': 'Y'}}}
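Because the result is an ordinary dictionary, it can be post-processed with plain Python. The sketch below turns a dictionary of that shape into one readable line per difference. The helper `summarize` is hypothetical and not part of csvcomparer, and the `diffs` literal is copied from the output above so that the snippet runs on its own.

```python
def summarize(diffs):
    """Hypothetical helper: one human-readable line per reported difference."""
    lines = []
    for col in diffs.get("cols_added", []):
        lines.append(f"column added: {col}")
    for col in diffs.get("cols_removed", []):
        lines.append(f"column removed: {col}")
    for key in diffs.get("rows_added", {}):
        lines.append(f"row added: {key}")
    for key in diffs.get("rows_removed", {}):
        lines.append(f"row removed: {key}")
    for key, changes in diffs.get("rows_changed", {}).items():
        # Each change is a (column, left value, right value) tuple.
        for column, old, new in changes:
            lines.append(f"row {key}: {column} changed from {old} to {new}")
    return lines


# Copied from the example output so no files are needed to run this.
diffs = {
    "cols_added": ["stars"],
    "cols_removed": ["score", "togo"],
    "rows_added": {
        "4C": {"name": "taco", "pic": "🌮", "price": "$8.33", "stars": 3.1},
        "5B": {"name": "pizza", "pic": "🍕", "price": "$9.99", "stars": 2.4},
    },
    "rows_changed": {
        "1A": [("pic", "🍺", "🍻"), ("price", "$6.00", "$5.25")],
        "2A": [("price", "$4.10", "$3.95")],
    },
    "rows_removed": {
        "3A": {"name": "bacon", "pic": "🥓", "price": "$3.33", "score": 4.9, "togo": "Y"},
    },
}

for line in summarize(diffs):
    print(line)
```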

Multi-column keys are also supported. So for the same file data:

>>> CsvCompare("menu_l.csv", "menu_r.csv", ["id", "name"]).diffs


{'cols_added': ['stars'],
 'cols_removed': ['score', 'togo'],
 'rows_added': {('4C', 'taco'): {'pic': '🌮', 'price': '$8.33', 'stars': 3.1},
                ('5B', 'pizza'): {'pic': '🍕', 'price': '$9.99', 'stars': 2.4}},
 'rows_changed': {('1A', 'beer'): [('pic', '🍺', '🍻'),
                                   ('price', '$6.00', '$5.25')],
                  ('2A', 'cheese'): [('price', '$4.10', '$3.95')]},
 'rows_removed': {('3A', 'bacon'): {'pic': '🥓',
                                    'price': '$3.33',
                                    'score': 4.9,
                                    'togo': 'Y'}}}
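In the multi-column output, each row is identified by a tuple of key values, and the key columns no longer appear inside the row data. A minimal sketch of that indexing idea follows, using only the standard library. The helpers `row_key` and `index_rows` are hypothetical and are not taken from csvcomparer.

```python
import csv
import io


def row_key(row, key):
    """Return a scalar for a single key column, or a tuple for several."""
    if isinstance(key, str):
        return row[key]
    return tuple(row[column] for column in key)


def index_rows(text, key):
    """Index comma-delimited text by `key`, dropping key columns from each row."""
    key_columns = {key} if isinstance(key, str) else set(key)
    index = {}
    for row in csv.DictReader(io.StringIO(text)):
        index[row_key(row, key)] = {
            c: v for c, v in row.items() if c not in key_columns
        }
    return index


menu = "id,name,price\n4C,taco,$8.33\n5B,pizza,$9.99\n"
by_id = index_rows(menu, "id")
by_id_and_name = index_rows(menu, ["id", "name"])
```

With a single key, `by_id["4C"]` still contains `name`. With the composite key, the entry is looked up as `by_id_and_name[("4C", "taco")]` and contains only `price`.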

See the docs for detailed usage and examples.




Installation

With Poetry installed, add the package to your project:

poetry add csvcomparer



Roadmap

csvcomparer is in its infancy, and there are high hopes for this project!

The ultimate goal is to compare any two data sets that can be consumed as a "dataframe", regardless of size, as efficiently as possible. This comes with many challenges, but I'm confident it will get there.

See the open issues for a full list of proposed features (and known issues).



Contributing

Any contributions are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request



License

Distributed under the MIT License. See LICENSE.txt for more information.



Contact

Ryan Bergsmith - LinkedIn
Project Link: GitHub


