Rich file comparison with a focus on structured and tabular data
sdiff
About
sdiff is both a diff tool and a library. You can use it to build diffs and compare strings, sequences,
arrays, nested sequences, matrices, texts, tables, files, etc. It runs the Myers diff algorithm under the hood
and is implemented in Python and Cython.
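For readers unfamiliar with the Myers algorithm, the following is a minimal pure-Python sketch of its greedy forward search. It only computes the length of the shortest edit script (the number of insertions plus deletions). The function name `myers_distance` is made up for this illustration, and the code is not taken from sdiff, whose Cython implementation may differ in many details.

```python
def myers_distance(a, b):
    """Length of the shortest edit script (insertions + deletions) turning a into b.

    Illustrative sketch of the greedy Myers search; not sdiff's own code.
    """
    n, m = len(a), len(b)
    max_d = n + m
    # v[k] is the furthest x reached so far on diagonal k = x - y
    v = {1: 0}
    for d in range(max_d + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]      # move down: take an element from b (insertion)
            else:
                x = v[k - 1] + 1  # move right: drop an element from a (deletion)
            y = x - k
            # follow the diagonal for as long as the elements match
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return max_d


a = ['apples', 'bananas', 'carrots', 'dill']
b = ['apples', 'carrots', 'dill', 'eggplant']
print(myers_distance(a, b))  # one deletion plus one insertion
```

The search explores edit scripts of growing length `d` and stops at the first one that reaches the end of both sequences, so its cost grows with the size of the difference rather than with the product of the input lengths.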
Features
sdiff is not a drop-in replacement for your diff tool, but it does some things nicely:
- It works on plain text as usual.
- It supports tables.
- It is pretty fast.
- It exposes a low-level Python API to compare/align arbitrary sequences.
- The CLI sdiff tool can compare entire directories while discovering file types on the fly. It can be fine-tuned to include/exclude files, align file names through regexes, set various similarity measures, and produce colored reports in various formats.
Install
pip install sdiff
Install the latest git version
pip install git+https://github.com/pulkin/sdiff.git
Examples
CLI
> sdiff a.csv b.csv
comparing a.csv vs b.csv
  Country     Region Date       Kilotons of Co2 Metric Tons Per Capita
- ----------- ------ ---------- --------------- ----------------------
(3 row(s) match)
3 Afghanistan Asia   01-01-2019 6080            0.16
4 Afghanistan Asia   01-01-2018 6070            0.17
5 Afghanistan Asia   01-01-2013 ---5990---      0.19
                                +++6000+++
6 Afghanistan Asia   01-01-2015 5950            0.18
7 Afghanistan Asia   01-01-2016 5300            0.15
(1 row(s) match)
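To make the report above easier to read: a changed cell is shown with the old value wrapped in `---` and the new value wrapped in `+++`, while unchanged rows are collapsed into "row(s) match" lines. The sketch below imitates that markup with the standard library only. It assumes the rows of both tables are already aligned one-to-one, which sdiff does not require. The helper name `mark_cell_changes` and the inline CSV snippets are hypothetical and only loosely modelled on the example.

```python
import csv
import io


def mark_cell_changes(rows_a, rows_b):
    """Return (row_number, cells) for every row that differs.

    Simplification: rows are compared pairwise in order, so inserted or
    deleted rows are not handled. A real table diff has to align rows first.
    """
    report = []
    for number, (row_a, row_b) in enumerate(zip(rows_a, rows_b)):
        if row_a == row_b:
            continue  # matching rows are skipped, as in the collapsed report
        cells = [
            cell_a if cell_a == cell_b else f"---{cell_a}--- +++{cell_b}+++"
            for cell_a, cell_b in zip(row_a, row_b)
        ]
        report.append((number, cells))
    return report


# Tiny illustrative inputs, not the actual a.csv / b.csv
a_csv = "Country,Date,Kilotons of Co2\nAfghanistan,01-01-2013,5990\nAfghanistan,01-01-2015,5950\n"
b_csv = "Country,Date,Kilotons of Co2\nAfghanistan,01-01-2013,6000\nAfghanistan,01-01-2015,5950\n"

rows_a = list(csv.reader(io.StringIO(a_csv)))
rows_b = list(csv.reader(io.StringIO(b_csv)))
changes = mark_cell_changes(rows_a, rows_b)
for number, cells in changes:
    print(number, *cells)
```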
API
from sdiff.sequence import diff
print(diff(
['apples', 'bananas', 'carrots', 'dill'],
['apples', 'carrots', 'dill', 'eggplant']
).to_string())
a≈b (ratio=0.7500)
··a[0:1]=b[0:1]: ['apples'] = ['apples']
··a[1:2]≠b[1:1]: ['bananas'] ≠ []
··a[2:4]=b[1:3]: ['carrots', 'dill'] = ['carrots', 'dill']
··a[4:4]≠b[3:4]: [] ≠ ['eggplant']
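Each line of this output pairs a slice of `a` with a slice of `b` and states whether the two are equal. As a cross-check on how to read it, the sketch below runs the same two lists through `difflib.SequenceMatcher` from the Python standard library. This is a different implementation from sdiff, used here only for comparison. For this particular input it reports the same four blocks, and its ratio (twice the number of matching elements divided by the total number of elements) comes out to the same 0.75. Whether sdiff defines its ratio the same way in general is an assumption.

```python
from difflib import SequenceMatcher

a = ['apples', 'bananas', 'carrots', 'dill']
b = ['apples', 'carrots', 'dill', 'eggplant']

# autojunk=False disables difflib's popularity heuristic so the result
# depends only on the sequence contents
matcher = SequenceMatcher(None, a, b, autojunk=False)
opcodes = matcher.get_opcodes()
ratio = matcher.ratio()

for tag, i1, i2, j1, j2 in opcodes:
    # e.g. "delete a[1:2] b[1:1]" corresponds to the a[1:2]≠b[1:1] line
    print(f"{tag:7} a[{i1}:{i2}] b[{j1}:{j2}]  {a[i1:i2]} {b[j1:j2]}")
print(f"ratio={ratio:.4f}")
```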
Documentation
Visit https://sdiff.readthedocs.io/en/latest/
License
Download files
Source Distribution
File details
Details for the file sdiff-0.1.8.tar.gz.
File metadata
- Download URL: sdiff-0.1.8.tar.gz
- Upload date:
- Size: 263.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e9f42d9bd612001bbf633d1fb128353d032391f4798320fc8ce964a7109b0531 |
| MD5 | 61e08dcb15d7f69b7b3c3f64e4de602d |
| BLAKE2b-256 | 22c785bc3cbab656d67254de8102a48a912713556476c90d35ac824be515f50f |
Provenance
The following attestation bundles were made for sdiff-0.1.8.tar.gz:
Publisher: pypi.yml on pulkin/sdiff
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: sdiff-0.1.8.tar.gz
- Subject digest: e9f42d9bd612001bbf633d1fb128353d032391f4798320fc8ce964a7109b0531
- Sigstore transparency entry: 234371436
- Sigstore integration time:
- Permalink: pulkin/sdiff@1197cff30fda005331e4685b3f859ee9c64184d0
- Branch / Tag: refs/heads/main
- Owner: https://github.com/pulkin
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi.yml@1197cff30fda005331e4685b3f859ee9c64184d0
- Trigger Event: workflow_dispatch
-
Statement type: