Rich file comparison with a focus on structured and tabular data

sdiff

About

sdiff is both a diff tool and a library. You can use it to build diffs and compare strings, sequences, arrays, nested sequences, matrices, texts, tables, files, and more. It runs the Myers diff algorithm under the hood and is implemented in Python and Cython.
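
For readers unfamiliar with the Myers algorithm, here is a minimal pure-Python sketch of its greedy core, which computes the length of the shortest edit script (insertions plus deletions) between two sequences. This is an illustration only and not sdiff's actual Cython implementation; the function name `myers_edit_distance` is hypothetical.

```python
def myers_edit_distance(a, b):
    """Return the minimal number of insertions and deletions turning a into b.

    Illustrative sketch of the greedy Myers algorithm; not sdiff's code.
    """
    n, m = len(a), len(b)
    # v[k] is the furthest x reached so far on diagonal k = x - y
    v = {1: 0}
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]      # move down: take an element from b (insertion)
            else:
                x = v[k - 1] + 1  # move right: drop an element from a (deletion)
            y = x - k
            # follow the diagonal while elements match (free moves)
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return n + m


a = ['apples', 'bananas', 'carrots', 'dill']
b = ['apples', 'carrots', 'dill', 'eggplant']
print(myers_edit_distance(a, b))  # 2: delete 'bananas', insert 'eggplant'
```

The sketch only returns the edit count; a full diff additionally records the path taken so that the matching and differing chunks can be reported.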

Features

sdiff is not a drop-in replacement for your diff tool, but it does some things nicely.

  • You can use it for text as usual.
  • It supports tables.
  • It is pretty fast.
  • It exposes a low-level Python API to compare and align arbitrary sequences.
  • The sdiff CLI tool can compare entire directories, discovering file types on the fly. It can be fine-tuned to include or exclude files, align file names through regexes, set various similarity measures, and produce colored reports in various formats.

Install

pip install sdiff

Install the latest git version

pip install git+https://github.com/pulkin/sdiff.git

Examples

CLI

> sdiff a.csv b.csv
comparing a.csv vs b.csv
  Country     Region Date       Kilotons of Co2 Metric Tons Per Capita
- ----------- ------ ---------- --------------- ----------------------
(3 row(s) match)
3 Afghanistan Asia   01-01-2019 6080            0.16                  
4 Afghanistan Asia   01-01-2018 6070            0.17                  
5 Afghanistan Asia   01-01-2013 ---5990---      0.19                  
                                +++6000+++                            
6 Afghanistan Asia   01-01-2015 5950            0.18                  
7 Afghanistan Asia   01-01-2016 5300            0.15                  
(1 row(s) match)

API

from sdiff.sequence import diff

print(diff(
  ['apples', 'bananas', 'carrots', 'dill'],
  ['apples', 'carrots', 'dill', 'eggplant']
).to_string())
a≈b (ratio=0.7500)
··a[0:1]=b[0:1]: ['apples'] = ['apples']
··a[1:2]≠b[1:1]: ['bananas'] ≠ []
··a[2:4]=b[1:3]: ['carrots', 'dill'] = ['carrots', 'dill']
··a[4:4]≠b[3:4]: [] ≠ ['eggplant']
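
To help read the output above, the following sketch reproduces the same chunks and the same ratio with Python's standard-library difflib rather than sdiff. It assumes that the reported ratio follows the common definition 2 * matches / (len(a) + len(b)); the example output is consistent with that formula (2 * 3 / 8 = 0.75), but the sdiff documentation remains the authority on how the value is computed.

```python
from difflib import SequenceMatcher

a = ['apples', 'bananas', 'carrots', 'dill']
b = ['apples', 'carrots', 'dill', 'eggplant']

# Standard-library matcher, used here only for comparison with sdiff's output
matcher = SequenceMatcher(None, a, b)

# Each opcode is (tag, a_start, a_end, b_start, b_end)
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    print(f"{tag:7} a[{i1}:{i2}] b[{j1}:{j2}]: {a[i1:i2]} vs {b[j1:j2]}")

# Count matched elements and apply the assumed ratio formula
matched = sum(block.size for block in matcher.get_matching_blocks())
ratio = 2 * matched / (len(a) + len(b))
print(matched, ratio)  # 3 0.75
```

The four opcodes (equal, delete, equal, insert) cover the same index ranges as the four lines printed by sdiff.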

Documentation

Visit https://sdiff.readthedocs.io/en/latest/

License

LICENSE.md
