
Keep track of changes in data structures.

Project description

difftrack is a tool for keeping track of changes in data structures. It makes it possible for multiple “listeners” to see changes in a dict, a list or any other data structure you want to observe and support (these structures are called “dispatchers”).

difftrack has two main classes:

  • Dispatcher - acts like a data structure you write to but also sends all changes (diffs) to all its listeners.
  • Listener - a listener is connected to one dispatcher and applies incoming diffs to its internal structure, so each listener looks like the original data structure once it has applied all those diffs.

This division allows difftrack to have multiple listeners in different stages of applying diffs, and it enables listeners with special abilities (e.g. difftrack.utils.BoundedListDiffHandler implementing a “top N” list: the list never exceeds a certain fixed size, but when some items are deleted, previously invisible elements appear).
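To make the division concrete, here is a minimal sketch of the pattern in plain Python. It is not difftrack's implementation: the class names, the `receive` method and the internal queue are assumptions made for illustration, and only the diff shape `(operation, index, value)` and the public method names are taken from the examples in this description.

```python
from enum import Enum


class ListDiff(Enum):
    # Stand-in for difftrack.ListDiff; values mirror the reprs shown in this description.
    INSERT = 0
    REPLACE = 1
    DELETE = 2


class SketchListener:
    """Queues incoming diffs and applies them only when asked."""

    def __init__(self):
        self._pending = []  # diffs received but not yet applied
        self._data = []     # private copy of the observed list

    def receive(self, diff):  # hypothetical hook called by the dispatcher
        self._pending.append(diff)

    def get_new_diffs(self):
        diffs, self._pending = self._pending, []
        for op, index, value in diffs:
            if op is ListDiff.INSERT:
                self._data.insert(index, value)
            elif op is ListDiff.REPLACE:
                self._data[index] = value
            else:  # ListDiff.DELETE
                del self._data[index]
        return diffs

    def get_snapshot(self):
        return list(self._data)


class SketchDispatcher:
    """Turns every write into a diff and fans it out to all listeners."""

    def __init__(self):
        self._listeners = []

    def add_listener(self, listener):
        self._listeners.append(listener)

    def _dispatch(self, diff):
        for listener in self._listeners:
            listener.receive(diff)

    def insert(self, index, value):
        self._dispatch((ListDiff.INSERT, index, value))

    def __setitem__(self, index, value):
        self._dispatch((ListDiff.REPLACE, index, value))

    def __delitem__(self, index):
        self._dispatch((ListDiff.DELETE, index, None))


dispatcher = SketchDispatcher()
fast, slow = SketchListener(), SketchListener()
dispatcher.add_listener(fast)
dispatcher.add_listener(slow)

dispatcher.insert(0, 'AAA')
fast.get_new_diffs()         # fast applies the first insert right away
dispatcher.insert(0, 'BBB')  # both listeners queue this one
```

At this point `fast` has applied one diff and `slow` none, so the two listeners sit at different stages of the same diff stream until each of them reads its queue.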

Basic usage

In the following example we create a list dispatcher (you can write to it as you would to a list, using __setitem__, __delitem__ and insert) and two listeners that listen for diffs and keep their own internal state.

>>> import difftrack
>>> dispatcher = difftrack.ListDispatcher()
>>> listener1 = difftrack.ListListener()
>>> listener2 = difftrack.ListListener()
>>> dispatcher.add_listener(listener1)
>>> dispatcher.add_listener(listener2) # create listeners and add them to dispatcher

>>> dispatcher.insert(0, 'AAA') # insert string 'AAA' to the first position in list
>>> listener1.get_snapshot() # Diffs are not applied until get_new_diffs() is called
[]
>>> listener1.get_new_diffs() # now we get all diffs that have not been processed yet
[(difftrack.ListDiff.INSERT, 0, 'AAA')]
>>> listener1.get_snapshot() # and we see that listener1's snapshot now contains what we expect
['AAA']
>>> listener2.get_snapshot() # second listener still hasn't got anything because we haven't read its diffs
[]

>>> dispatcher.insert(0, 'BBB') # insert the new string 'BBB' at the first position
>>> listener1.get_new_diffs() # we need to read new diffs to get current state
[(difftrack.ListDiff.INSERT, 0, 'BBB')]
>>> listener1.get_snapshot() # we inserted 'BBB' to first position so 'AAA' was moved to second position
['BBB', 'AAA']

>>> del dispatcher[0] # remove the first element from the list (now 'BBB')
>>> listener1.get_new_diffs()
[(difftrack.ListDiff.DELETE, 0, None)]
>>> listener1.get_snapshot() # we deleted 'BBB' so only 'AAA' remains
['AAA']

>>> dispatcher[0] = 'CCC' # overwrite the first element
>>> listener1.get_new_diffs()
[(difftrack.ListDiff.REPLACE, 0, 'CCC')]
>>> listener1.get_snapshot() # the first and only element in the list was overwritten
['CCC']

>>> listener2.get_new_diffs() # finally get all diffs for listener2
[(<ListDiff.INSERT: 0>, 0, 'AAA'),
 (<ListDiff.INSERT: 0>, 0, 'BBB'),
 (<ListDiff.DELETE: 2>, 0, None),
 (<ListDiff.REPLACE: 1>, 0, 'CCC')]
>>> listener2.get_snapshot() # listener2 is now also up to date
['CCC']

Similarly you can use difftrack with DictDispatcher and DictListener: you write your changes to an instance of DictDispatcher and after applying diffs to listeners you can get a snapshot of the current dictionary state.
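As a rough illustration of what applying dict diffs amounts to, the sketch below replays `(operation, key, value)` triples onto a plain dict. It assumes dict diffs have the same shape as the ones shown in the compact_dict_diffs section; the `DictDiff` enum here is a stand-in with arbitrary values and `apply_dict_diffs` is a hypothetical helper, not part of difftrack.

```python
from enum import Enum


class DictDiff(Enum):
    # Stand-in for difftrack.DictDiff; the member values are arbitrary.
    SET = 0
    DELETE = 1


def apply_dict_diffs(snapshot, diffs):
    """Return a new dict with the (operation, key, value) diffs applied in order."""
    result = dict(snapshot)
    for op, key, value in diffs:
        if op is DictDiff.SET:
            result[key] = value
        elif op is DictDiff.DELETE:
            del result[key]
        else:
            raise ValueError(f'unknown operation: {op!r}')
    return result


diffs = [
    (DictDiff.SET, 'x', 123),
    (DictDiff.SET, 'y', 456),
    (DictDiff.DELETE, 'x', None),
]
state = apply_dict_diffs({}, diffs)  # only 'y' survives
```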

Callbacks

on_change

We can also add a callback to a listener so that we are notified when a diff comes:

>>> import difftrack
>>> dispatcher = difftrack.ListDispatcher()
>>> def double_inserted_items(dtype, index, value):
...     '''This generates a new diff *while the current one is processed!*'''
...     if dtype is difftrack.ListDiff.INSERT:
...         dispatcher[index] = value * 2

>>> listener = difftrack.ListListener(on_change = double_inserted_items) # set function as a callback
>>> dispatcher.add_listener(listener)
>>> dispatcher.insert(0, 7) # insert 7 at index 0 and expect that the result will be doubled
>>> listener.get_new_diffs()
[
        (difftrack.ListDiff.INSERT, 0, 7),
        (difftrack.ListDiff.REPLACE, 0, 14)
]
>>> listener.get_snapshot()
[14]

This example shows the on_change callback and its ability to write back to the dispatcher. Note that the original operation is a ListDiff.INSERT but the callback triggers a ListDiff.REPLACE. If the callback triggered another ListDiff.INSERT instead, the dispatch would recurse; after 10 iterations difftrack gives up and raises an exception.
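A guard like this can be sketched with a simple depth counter. The code below is only an illustration of the idea: `GuardedDispatcher`, `DispatchDepthError` and the way iterations are counted are assumptions, and the only detail taken from the text above is the limit of 10.

```python
class DispatchDepthError(RuntimeError):
    """Hypothetical exception type used only in this sketch."""


class GuardedDispatcher:
    MAX_DEPTH = 10  # the limit quoted in the text above

    def __init__(self, on_change):
        self._on_change = on_change
        self._depth = 0
        self.log = []  # every diff that was dispatched, in order

    def dispatch(self, diff):
        if self._depth >= self.MAX_DEPTH:
            raise DispatchDepthError('callbacks keep generating new diffs')
        self._depth += 1
        try:
            self.log.append(diff)
            self._on_change(self, diff)  # the callback may call dispatch() again
        finally:
            self._depth -= 1


def insert_forever(dispatcher, diff):
    # Every insert triggers another insert, so this can never settle.
    dispatcher.dispatch(('INSERT', 0, diff[2] + 1))


runaway = GuardedDispatcher(insert_forever)
try:
    runaway.dispatch(('INSERT', 0, 0))
except DispatchDepthError:
    gave_up = True
else:
    gave_up = False
```

A callback that answers an insert with a replace, as in the example above, stops after one nested dispatch and never gets near the limit.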

on_finalize_batch

The dispatcher may communicate to its listeners that a certain sequence of diffs belongs together, i.e. forms a batch. We do this by using the dispatcher as a context manager, wrapping the diff operations that belong together.

A listener may provide another callback called on_finalize_batch that gets called every time the dispatcher finishes dispatching a batch (the context is exited).

>>> import difftrack
>>> dispatcher = difftrack.DictDispatcher()
>>> def finalize():
...     print('FINALIZED')
...
>>> def on_change(*args):
...     print('CHANGE')
...
>>> listener = difftrack.DictListener(on_change=on_change, on_finalize_batch=finalize)
>>> dispatcher.add_listener(listener)
>>> with dispatcher: # use the dispatcher as a context manager
...     dispatcher[0] = 0
...     dispatcher[1] = 1
...     dispatcher[2] = 2

CHANGE
CHANGE
CHANGE
FINALIZED

We can see that the on_change callback is called for every change, but on_finalize_batch is called only once, when we exit the context.
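The batching behaviour maps naturally onto Python's context manager protocol. The following sketch shows one way it could be wired up; `BatchingDispatcher` and `add_callbacks` are hypothetical names, and calling the finalize callback even when the block raises is a choice made for this sketch, not a documented property of difftrack.

```python
class BatchingDispatcher:
    """Hypothetical dispatcher that reports the end of a batch to its listeners."""

    def __init__(self):
        self._callbacks = []  # (on_change, on_finalize_batch) pairs

    def add_callbacks(self, on_change, on_finalize_batch):
        self._callbacks.append((on_change, on_finalize_batch))

    def __setitem__(self, key, value):
        # Every write is reported immediately.
        for on_change, _ in self._callbacks:
            on_change('SET', key, value)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Leaving the with block marks the end of the batch.
        for _, on_finalize_batch in self._callbacks:
            on_finalize_batch()
        return False  # never swallow exceptions raised inside the batch


events = []
dispatcher = BatchingDispatcher()
dispatcher.add_callbacks(
    on_change=lambda op, key, value: events.append(('CHANGE', key)),
    on_finalize_batch=lambda: events.append('FINALIZED'),
)
with dispatcher:
    dispatcher[0] = 0
    dispatcher[1] = 1
```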

Utilities

There are several utilities that you might find useful.

data_mapper

Data mapper applies a function to every data field:

>>> import difftrack
>>> def mapper(data: str) -> str:
...     return data.lower()
...
>>> dispatcher = difftrack.ListDispatcher()
>>> listener = difftrack.ListListener()
>>> dispatcher.add_listener(difftrack.data_mapper(mapper)(listener))

>>> dispatcher.insert(0, 'AAA')
>>> dispatcher.insert(0, 'BBB')
>>> listener.get_new_diffs()
[
        (difftrack.ListDiff.INSERT, 0, 'aaa'),
        (difftrack.ListDiff.INSERT, 0, 'bbb')
]
>>> listener.get_snapshot()
['bbb', 'aaa']
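The `data_mapper(mapper)(listener)` call shape suggests a factory that returns a wrapper. A minimal sketch of that shape, using plain callables instead of difftrack listeners, could look as follows. `data_mapper_sketch` is a hypothetical name, and leaving `None` payloads untouched is an assumption of this sketch.

```python
def data_mapper_sketch(func):
    """Return a wrapper that applies func to the payload of every diff."""
    def wrap(receive):
        def mapped_receive(op, index, value):
            # Deletions carry no payload (None), so there is nothing to map.
            receive(op, index, value if value is None else func(value))
        return mapped_receive
    return wrap


received = []


def record(op, index, value):
    received.append((op, index, value))


handler = data_mapper_sketch(str.lower)(record)
handler('INSERT', 0, 'AAA')
handler('DELETE', 0, None)
```

Because the mapping happens before the diff reaches the wrapped receiver, the receiver only ever sees mapped values.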

compact_dict_diffs

When you update a dict item several times, or even delete it, you sometimes don’t want to keep all the changes. You can use compaction to drop changes that cancel each other out or are overridden by later ones:

>>> diffs = [
        (difftrack.DictDiff.SET, 'x', 123),
        (difftrack.DictDiff.SET, 'y', 456),
        (difftrack.DictDiff.SET, 'y', 9999),
        (difftrack.DictDiff.DELETE, 'x', None),
]
>>> difftrack.compact_dict_diffs(diffs)
[
        (difftrack.DictDiff.SET, 'y', 9999),
        (difftrack.DictDiff.DELETE, 'x', None),
]
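One simple way to get this result is to keep only the last diff per key, ordered by when that last diff occurred. The sketch below does exactly that with plain strings standing in for the `DictDiff` members; it is an illustration, and difftrack's own compaction rules may differ.

```python
def compact_dict_diffs_sketch(diffs):
    """Keep only the last (operation, key, value) diff for every key."""
    latest = {}
    for op, key, value in diffs:
        latest.pop(key, None)           # forget the older diff for this key...
        latest[key] = (op, key, value)  # ...and re-add the key at the end
    return list(latest.values())


diffs = [
    ('SET', 'x', 123),
    ('SET', 'y', 456),
    ('SET', 'y', 9999),
    ('DELETE', 'x', None),
]
compacted = compact_dict_diffs_sketch(diffs)
```

The sketch relies on dicts preserving insertion order (Python 3.7 and later). It keeps the final DELETE because the key may already exist in a listener's snapshot.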

compact_list_diffs

The same kind of compaction is available for lists as well:

>>> diffs = [
        (difftrack.ListDiff.INSERT, 0, 'aaa'),
        (difftrack.ListDiff.INSERT, 1, 'bbb'),
        (difftrack.ListDiff.DELETE, 0, None),
        (difftrack.ListDiff.REPLACE, 1, 'ccc'),
]
>>> difftrack.compact_list_diffs(diffs)
[
        (difftrack.ListDiff.INSERT, 1, 'ccc'),
]

BoundedListDiffHandler

If we want to keep our list bounded (capped to a certain size) we can use difftrack.BoundedListDiffHandler.

>>> import difftrack
>>> listener = difftrack.ListListener()
>>> dispatcher = difftrack.ListDispatcher()
>>> dispatcher.add_listener(difftrack.BoundedListDiffHandler(listener, 2)) # bound listener to 2 elements

>>> dispatcher.insert(0, 'a')
>>> dispatcher.insert(1, 'b')
>>> dispatcher.insert(2, 'c')
>>> dispatcher.insert(3, 'd')
>>> listener.get_new_diffs()
[
        (difftrack.ListDiff.INSERT, 0, 'a'),
        (difftrack.ListDiff.INSERT, 1, 'b'),
]
>>> listener.get_snapshot()
['a', 'b']

>>> del dispatcher[0]
>>> listener.get_new_diffs() # 'a' is deleted and 'c' moves to the empty index 1
[
        (<ListDiff.DELETE: 2>, 0, None),
        (<ListDiff.INSERT: 0>, 1, 'c')
]
>>> listener.get_snapshot()
['b', 'c']
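The “top N” behaviour can be reproduced by tracking the full list and translating each change into diffs for the visible window only. The class below is a sketch under that assumption: `BoundedViewSketch` is a hypothetical name, it uses strings in place of `ListDiff` members, and it handles non-negative indices only.

```python
class BoundedViewSketch:
    """Tracks the full list but emits diffs only for the first `limit` items."""

    def __init__(self, limit):
        self.limit = limit
        self._full = []    # complete list, including hidden items
        self.emitted = []  # diffs a bounded listener would receive

    def insert(self, index, value):
        self._full.insert(index, value)
        if index < self.limit:
            self.emitted.append(('INSERT', index, value))
            if len(self._full) > self.limit:
                # The item that used to be last in the window got pushed out.
                self.emitted.append(('DELETE', self.limit, None))

    def delete(self, index):
        del self._full[index]
        if index < self.limit:
            self.emitted.append(('DELETE', index, None))
            if len(self._full) >= self.limit:
                # A previously hidden item slides into the last visible slot.
                last = self.limit - 1
                self.emitted.append(('INSERT', last, self._full[last]))

    def visible(self):
        return self._full[:self.limit]


view = BoundedViewSketch(2)
for position, item in enumerate('abcd'):
    view.insert(position, item)
view.delete(0)
```

Replaying the same operations as the example above, the sketch emits the two visible inserts followed by the DELETE at index 0 and the INSERT of 'c' at index 1.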

squash_list_diffs

This function groups list diffs affecting consecutive indices.

>>> import difftrack
>>> diffs = [
        (difftrack.ListDiff.INSERT, 1, 'A'),
        (difftrack.ListDiff.INSERT, 2, 'B'),
        (difftrack.ListDiff.INSERT, 3, 'C'),
        (difftrack.ListDiff.REPLACE, 1, 'D'),
        (difftrack.ListDiff.DELETE, 1, [])
]
>>> list(difftrack.squash_list_diffs(diffs))
[
        SquashResults(operation=<difftrack.ListDiff.INSERT: 0>, start=1, stop=1, payload=['A', 'B', 'C']),
        SquashResults(operation=<difftrack.ListDiff.REPLACE: 1>, start=1, stop=2, payload=['D']),
        SquashResults(operation=<difftrack.ListDiff.DELETE: 2>, start=1, stop=2, payload=[])
]

You can see that the three consecutive inserts are squashed into a single message. Note that the result is no longer a difftrack diff.
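The grouping idea can be sketched in a few lines. The helper below merges runs of inserts whose indices follow each other and leaves every other diff as its own group. It is a simplified illustration: `squash_inserts_sketch` is a hypothetical name, the result is a plain `(operation, start, payload)` tuple rather than a SquashResults object, and it does not try to reproduce the start and stop fields shown above.

```python
def squash_inserts_sketch(diffs):
    """Merge runs of inserts at consecutive indices into (op, start, payload)."""
    groups = []
    for op, index, value in diffs:
        last = groups[-1] if groups else None
        if (
            op == 'INSERT'
            and last is not None
            and last[0] == 'INSERT'
            and index == last[1] + len(last[2])
        ):
            last[2].append(value)  # extends the current run of inserts
        else:
            payload = [] if value is None else [value]
            groups.append((op, index, payload))
    return groups


diffs = [
    ('INSERT', 1, 'A'),
    ('INSERT', 2, 'B'),
    ('INSERT', 3, 'C'),
    ('REPLACE', 1, 'D'),
    ('DELETE', 1, None),
]
squashed = squash_inserts_sketch(diffs)
```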

Download files

difftrack-0.8.2.tar.gz (11.2 kB), source distribution, uploaded May 4, 2018
