Rich file comparison with a focus on structured and tabular data

sdiff

About

sdiff is both a diff tool and a library. You can use it to build diffs and compare strings, sequences, arrays, nested sequences, matrices, texts, tables, files, and more. It runs the Myers diff algorithm under the hood and is implemented in Python and Cython.
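
To give a feel for the Myers algorithm mentioned above, here is a minimal pure-Python sketch of its greedy forward pass. It computes only the length of the shortest edit script (insertions and deletions), not the script itself. This is an illustration, not sdiff's Cython implementation; the function name `myers_edit_distance` is hypothetical and is not part of the sdiff API.

```python
def myers_edit_distance(a, b):
    """Length of the shortest insert/delete edit script turning a into b.

    Illustrative sketch of the Myers O(ND) greedy search; not sdiff code.
    """
    n, m = len(a), len(b)
    # v[k] is the furthest x reached so far on diagonal k, where k = x - y.
    v = {1: 0}
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]      # step down: take an item from b (insertion)
            else:
                x = v[k - 1] + 1  # step right: drop an item from a (deletion)
            y = x - k
            # Follow the diagonal for free while items match.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return n + m


a = ['apples', 'bananas', 'carrots', 'dill']
b = ['apples', 'carrots', 'dill', 'eggplant']
print(myers_edit_distance(a, b))  # 2: delete 'bananas', insert 'eggplant'
```

The search explores edit scripts of increasing length `d`, so it stops quickly when the two inputs are similar, which is the common case when diffing.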

Features

sdiff is not a drop-in replacement for your diff tool, but it does some things nicely:

  • It handles plain text as usual.
  • It supports tables.
  • It is pretty fast.
  • It exposes a low-level Python API to compare and align arbitrary sequences.
  • The sdiff CLI tool can compare entire directories, discovering file types on the fly. It can be fine-tuned to include or exclude files, align file names through regexes, set various similarity measures, and produce colored reports in various formats.

Install

pip install sdiff

To install the latest version from git:

pip install git+https://github.com/pulkin/sdiff.git

Examples

CLI

> sdiff a.csv b.csv
comparing a.csv vs b.csv
  Country     Region Date       Kilotons of Co2 Metric Tons Per Capita
- ----------- ------ ---------- --------------- ----------------------
(3 row(s) match)
3 Afghanistan Asia   01-01-2019 6080            0.16                  
4 Afghanistan Asia   01-01-2018 6070            0.17                  
5 Afghanistan Asia   01-01-2013 ---5990---      0.19                  
                                +++6000+++                            
6 Afghanistan Asia   01-01-2015 5950            0.18                  
7 Afghanistan Asia   01-01-2016 5300            0.15                  
(1 row(s) match)
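
The report above marks a single changed cell within an otherwise matching row. The idea of a cell-level comparison can be sketched in plain Python as follows. This is a simplified illustration that assumes the rows of both tables are already aligned one to one; sdiff aligns rows itself, and nothing here is sdiff's API. The helper `changed_cells` is hypothetical, and the data is copied from the rows visible in the report.

```python
header = ["Country", "Region", "Date", "Kilotons of Co2", "Metric Tons Per Capita"]

# Rows 3 to 7 as shown in the report, "a" side.
rows_a = [
    ["Afghanistan", "Asia", "01-01-2019", "6080", "0.16"],
    ["Afghanistan", "Asia", "01-01-2018", "6070", "0.17"],
    ["Afghanistan", "Asia", "01-01-2013", "5990", "0.19"],
    ["Afghanistan", "Asia", "01-01-2015", "5950", "0.18"],
    ["Afghanistan", "Asia", "01-01-2016", "5300", "0.15"],
]

# The "b" side differs in one cell only.
rows_b = [row[:] for row in rows_a]
rows_b[2][3] = "6000"


def changed_cells(rows_a, rows_b, header):
    """Return (row offset, column name, old value, new value) per changed cell."""
    changes = []
    for offset, (row_a, row_b) in enumerate(zip(rows_a, rows_b)):
        for name, cell_a, cell_b in zip(header, row_a, row_b):
            if cell_a != cell_b:
                changes.append((offset, name, cell_a, cell_b))
    return changes


for offset, name, old, new in changed_cells(rows_a, rows_b, header):
    print(f"row offset {offset}, column {name!r}: {old} -> {new}")
```

The row offset is relative to the excerpt (0 is the first listed row), not the row number printed by the CLI.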

API

from sdiff.sequence import diff

print(diff(
  ['apples', 'bananas', 'carrots', 'dill'],
  ['apples', 'carrots', 'dill', 'eggplant']
).to_string())

Output:

a≈b (ratio=0.7500)
··a[0:1]=b[0:1]: ['apples'] = ['apples']
··a[1:2]≠b[1:1]: ['bananas'] ≠ []
··a[2:4]=b[1:3]: ['carrots', 'dill'] = ['carrots', 'dill']
··a[4:4]≠b[3:4]: [] ≠ ['eggplant']
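
For comparison, the same two lists can be run through the standard library's `difflib.SequenceMatcher`. It uses a different matching strategy (not Myers) and is not part of sdiff, but for this input it yields the same four blocks and the same ratio, which may help when reading the output above.

```python
from difflib import SequenceMatcher

a = ['apples', 'bananas', 'carrots', 'dill']
b = ['apples', 'carrots', 'dill', 'eggplant']

# autojunk=False disables difflib's popularity heuristic (irrelevant for
# inputs this small, but it keeps the behavior explicit).
matcher = SequenceMatcher(None, a, b, autojunk=False)
opcodes = matcher.get_opcodes()

# Each opcode is (tag, i1, i2, j1, j2), describing a[i1:i2] versus b[j1:j2].
for tag, i1, i2, j1, j2 in opcodes:
    print(f"{tag:7} a[{i1}:{i2}]={a[i1:i2]} b[{j1}:{j2}]={b[j1:j2]}")

print(f"ratio={matcher.ratio():.4f}")
```

Here the `equal` blocks correspond to the `=` lines above, while `delete` and `insert` correspond to the `≠` lines with an empty slice on one side.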

Documentation

Visit https://sdiff.readthedocs.io/en/latest/

License

LICENSE.md

Source Distribution

sdiff-0.1.8.tar.gz (263.9 kB)

Uploaded Source

File details

Details for the file sdiff-0.1.8.tar.gz.

File metadata

  • Download URL: sdiff-0.1.8.tar.gz
  • Size: 263.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for sdiff-0.1.8.tar.gz
Algorithm Hash digest
SHA256 e9f42d9bd612001bbf633d1fb128353d032391f4798320fc8ce964a7109b0531
MD5 61e08dcb15d7f69b7b3c3f64e4de602d
BLAKE2b-256 22c785bc3cbab656d67254de8102a48a912713556476c90d35ac824be515f50f

Provenance

The following attestation bundles were made for sdiff-0.1.8.tar.gz:

Publisher: pypi.yml on pulkin/sdiff
