An experimental diff library for generating operation deltas that represent the difference between two sequences of comparable items.

Project description

An open licensed (MIT) library for generating deltas (a.k.a. sequences of operations) representing the difference between two sequences of comparable tokens.

This library is intended to make experimental difference detection strategies more easily available. Two strategies are currently available:

deltas.sequence_matcher.diff(a, b):
    A shameless wrapper around difflib.SequenceMatcher to get it to work within the structure of deltas.
deltas.segment_matcher.diff(a, b, segmenter=None):
    A generalized difference detector that is designed to detect block moves and copies based on the use of a Segmenter.
Example:
>>> from deltas import segment_matcher, text_split
>>>
>>> a = text_split.tokenize("This is some text.  This is some other text.")
>>> b = text_split.tokenize("This is some other text.  This is some text.")
>>> operations = segment_matcher.diff(a, b)
>>>
>>> for op in operations:
...     print(op.name, repr(''.join(a[op.a1:op.a2])), repr(''.join(b[op.b1:op.b2])))
...
equal 'This is some other text.' 'This is some other text.'
insert ' ' '  '
equal 'This is some text.' 'This is some text.'
delete '  ' ''
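
The sequence_matcher strategy is described above as a wrapper around difflib.SequenceMatcher. As a rough illustration of that idea, and not the library's actual code, the sketch below turns difflib opcodes into operation tuples with the same attributes used in the example above (name, a1, a2, b1, b2). The names Operation, simple_tokenize and opcode_diff are hypothetical and are not part of deltas. Splitting difflib's "replace" opcode into a delete followed by an insert is also an assumption of this sketch.

```python
import re
from collections import namedtuple
from difflib import SequenceMatcher

# Hypothetical stand-in for an operation object; the field names mirror the
# attributes used in the example above, but this is not the deltas class.
Operation = namedtuple("Operation", ["name", "a1", "a2", "b1", "b2"])


def simple_tokenize(text):
    """Split text into word, whitespace, and punctuation tokens (illustrative only)."""
    return re.findall(r"\w+|\s+|[^\w\s]", text)


def opcode_diff(a, b):
    """Yield Operation tuples built from difflib.SequenceMatcher opcodes.

    difflib reports a 'replace' as one opcode; this sketch splits it into a
    delete followed by an insert so only three operation names are produced.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, a1, a2, b1, b2 in matcher.get_opcodes():
        if tag == "replace":
            yield Operation("delete", a1, a2, b1, b1)
            yield Operation("insert", a2, a2, b1, b2)
        else:
            yield Operation(tag, a1, a2, b1, b2)


a = simple_tokenize("This is some text.  This is some other text.")
b = simple_tokenize("This is some other text.  This is some text.")

for op in opcode_diff(a, b):
    print(op.name, repr("".join(a[op.a1:op.a2])), repr("".join(b[op.b1:op.b2])))
```

Because difflib only aligns tokens that appear in the same order in both sequences, it has no way to report a moved block as a move: the swapped sentence has to be expressed with insert and delete operations. That limitation is what a segment-based strategy such as segment_matcher is designed to address.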

Download files

Filename             Size     File type  Python version  Upload date
deltas-0.3.3.tar.gz  16.8 kB  Source     None            Aug 12, 2015
deltas-0.3.3.zip     28.6 kB  Source     None            Aug 12, 2015