
Tools for smart operations with Python dicts


ker_dict_tools

Smart tools to operate on Python's dicts.

  • dict_diff(): Function to compare two dicts reporting their differences.
  • get_value_by_path(): Get single or multiple values from a dict passing the path leading to them.
  • set_value_by_path(): Set a value in a nested dict passing the path leading to it.

The 'path' concept:

I needed a smart way to access dicts whose structure could change unpredictably (for example, nested lists of dicts) and to relate them to other objects. So I developed the functions 'get_value_by_path()' and 'set_value_by_path()', which allow me to access those values safely and quickly. The path is a list of values, each one pointing to the next sub-level of the dict. For example, given the dict:

dct = {
    "foo": [
        {"bar":1},
        {"baz":2}
    ]
}

the path to the value 2 is:

[
    "foo",      # Key for outer level of the dict
    1,          # Index for list contained in the "foo" value
    "baz"       # Key for dict at index 1 of the list
]
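
To make the idea concrete, here is a minimal sketch in plain Python (it does not use ker_dict_tools) showing that such a path is just a sequence of keys and indexes applied one after the other. It only handles "strict" paths made of dict keys and list indexes.

```python
from functools import reduce
from operator import getitem

dct = {
    "foo": [
        {"bar": 1},
        {"baz": 2}
    ]
}
path = ["foo", 1, "baz"]

# Apply each path element in turn, equivalent to dct["foo"][1]["baz"]
value = reduce(getitem, path, dct)
print(value)
```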

Since the library is designed to be used in a context where the actual dict could be partially unknown, the entry point for defining a path is the get_value_by_path() function, which accepts a much more "elastic" list (allowing simple queries on the dict).

get_value_by_path(dct, path, fail=False, debug=False)

Retrieves the value (or values) stored somewhere in the dct dict, provided the given path is correct (that is, it corresponds to the structure of the dict). The dct argument can be either a dict or a list of dicts. If fail=True is passed, a TypeError exception is raised when an element in the path does not correspond to the layer of the dct dict it is applied to. Otherwise the function returns an empty namedtuple whose .found attribute is 0. If debug=True is passed, the function logs every operation at debug level (using the logging module).

Accepted values in the path list when passed to get_value_by_path()

Accepted values, by dict layer type:

List of dicts:
  • int: index of the list
  • dict: key:value pair to be matched in one or more dicts inside the list
  • list (of dicts): list of dicts, each containing a single key:value pair, all to be matched in one or more dicts inside the list
  • str: key to be found among the keys of the dicts contained in the list
  • string "*": wildcard to return all the elements in the list

Dict:
  • str: key for the dict
  • string "*": wildcard to return all items in the dict

For example:

dct = [
{ "foo": [{"bar":1, "baz":2},{"bar":3, "baz":4}] },
{ "foo": [{"bar":5, "baz":6},{"bar":7, "baz":8}] },
]
path1 = [0, "foo", {"bar":1}, "baz"] 
path2 = [0, "foo", 0, "baz"]

path1 and path2 will lead to the value 2. With:

path3 = [0, "foo", "*", "baz"]

path3 will lead to the values 2 and 4
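
The matching rules used in these examples can be illustrated with a small standalone resolver. This is only a sketch of the behaviour described above, not the library's implementation: the name resolve() is hypothetical, and it covers only int indexes, str keys, the "*" wildcard and a dict matcher (the str-on-list and list-of-dicts matchers are left out).

```python
def resolve(node, path, prefix=()):
    """Yield (normalized_path, value) for every element of `node` matching `path`."""
    if not path:
        yield list(prefix), node
        return
    step, rest = path[0], path[1:]
    if isinstance(node, list):
        if isinstance(step, str) and step == "*":
            # Wildcard: follow every element of the list
            indexes = range(len(node))
        elif isinstance(step, int):
            # Plain index: follow it only if it is in range
            indexes = [step] if 0 <= step < len(node) else []
        elif isinstance(step, dict):
            # Dict matcher: follow every dict containing all the given key:value pairs
            indexes = [
                i for i, item in enumerate(node)
                if isinstance(item, dict)
                and all(k in item and item[k] == v for k, v in step.items())
            ]
        else:
            indexes = []
        for i in indexes:
            yield from resolve(node[i], rest, prefix + (i,))
    elif isinstance(node, dict):
        if isinstance(step, str) and step == "*":
            keys = list(node)
        elif isinstance(step, str) and step in node:
            keys = [step]
        else:
            keys = []
        for key in keys:
            yield from resolve(node[key], rest, prefix + (key,))


dct = [
    {"foo": [{"bar": 1, "baz": 2}, {"bar": 3, "baz": 4}]},
    {"foo": [{"bar": 5, "baz": 6}, {"bar": 7, "baz": 8}]},
]

matches1 = list(resolve(dct, [0, "foo", {"bar": 1}, "baz"]))
matches3 = list(resolve(dct, [0, "foo", "*", "baz"]))
print(matches1)  # one match, value 2
print(matches3)  # two matches, values 2 and 4
```

Note that every match is reported with a path made only of keys and indexes, even when the query used a wildcard or a dict matcher.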

Object returned by get_value_by_path()

The function will return a dict_search object.

res = get_value_by_path(myDict, myPath)

res => 'dict_search'(
    found=n, # -> number of matches
    results=[
        dict_branch(
            path={list1} # Path leading to the value #1
            value={value1} # Value #1
            ),
        dict_branch(
            path={list2} # Path leading to the value #2
            value={any} # Value #2
            ),
        ...
            
    ]
    )

The function will return a 'dict_search' namedtuple, containing two attributes:

  • .found {int}: Number of elements in the dict matching the given path
  • .results {list}: List of dict_branch namedtuples, each one specifying the path and value of an element matching the given path.

dict_branch namedtuple structure:
  • .path {list}: "normalized" path leading to the value (contains only dict keys or list indexes)
  • .value {any}: value found at specified path
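
As an illustration of how such a result might be consumed, the sketch below rebuilds the two namedtuples by hand from the attributes listed above. The definitions are an assumption made for the example (in particular the field order of dict_branch); in real use the object comes from get_value_by_path().

```python
from collections import namedtuple

# Hand-made stand-ins for the namedtuples described above (assumed field order)
dict_search = namedtuple("dict_search", ["found", "results"])
dict_branch = namedtuple("dict_branch", ["path", "value"])

# What a result with two matches could look like
res = dict_search(
    found=2,
    results=[
        dict_branch(path=[0, "foo", 0, "baz"], value=2),
        dict_branch(path=[0, "foo", 1, "baz"], value=4),
    ],
)

values = []
if res.found:  # 0 matches is falsy, so this also guards the "not found" case
    for branch in res.results:
        values.append(branch.value)
        print(branch.path, "->", branch.value)
```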

Using get_value_by_path() to query the dict

It is also possible to check whether a given value is present at a certain layer of the dict by passing the path leading to it, with the value to find as the last item in the path list. The function returns a namedtuple whose index 0 (.found) gives the number of matches for the given path.

set_value_by_path(dct, path, value, debug=False)

Sets a given value at the position of the dct dict specified by the given path. The path argument must be a list of ints or strings matching the structure of the dict (such as those returned in the dict_search object from get_value_by_path()).
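
The underlying idea can be sketched in a few lines of plain Python. The helper below, set_by_path(), is a hypothetical stand-in written for this example, not the library function: it walks to the parent of the target and assigns the last key or index there. It assumes a non-empty path whose intermediate levels already exist.

```python
def set_by_path(container, path, value):
    """Assign `value` at the position described by a path of keys and indexes."""
    *parents, last = path  # raises ValueError on an empty path
    target = container
    for step in parents:
        target = target[step]  # descend one level per path element
    target[last] = value


dct = {"foo": [{"bar": 1}, {"baz": 2}]}
set_by_path(dct, ["foo", 1, "baz"], 42)
print(dct)
```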

diff_dict(dct1, dct2, fail=False, startPath=None)

Compares the dct1 and dct2 dicts. Both arguments must be of the same type (both lists of dicts or both plain dicts). Returns a 'diff_results' object containing four attributes:

  • .compared {bool}: True if the comparison has been performed without problems.
  • .updated {list}: List of 'updated_item' namedtuples (see below) containing info about elements present in both dicts but with different values.
  • .added {list}: List of paths pointing to elements found in dct2 but not in dct1.
  • .removed {list}: List of paths pointing to elements found in dct1 but not in dct2.

The updated_item namedtuple, which populates the .updated list, has the following structure:

  • .path {list}: Path leading to updated element.
  • .old_value {any}: Old value for the element.
  • .old_type {type}: Old type for element's value.
  • .new_value {any}: New value for the element.
  • .new_type {type}: New type for element's value.
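
To show how these attributes fit together, here is a rough standalone sketch of a path-based comparison. It is not the library's code: sketch_diff() is a hypothetical name, the namedtuples are rebuilt from the attribute lists above, and the fail and startPath arguments are not modelled.

```python
from collections import namedtuple

# Stand-ins rebuilt from the attributes described above (assumed field order)
diff_results = namedtuple("diff_results", ["compared", "updated", "added", "removed"])
updated_item = namedtuple(
    "updated_item", ["path", "old_value", "old_type", "new_value", "new_type"]
)


def sketch_diff(old, new):
    """Compare two nested dict/list structures and report differences by path."""
    updated, added, removed = [], [], []

    def walk(a, b, path):
        if isinstance(a, dict) and isinstance(b, dict):
            keys_a, keys_b = list(a), list(b)
        elif isinstance(a, list) and isinstance(b, list):
            keys_a, keys_b = range(len(a)), range(len(b))
        else:
            # Leaf (or mismatched containers): report only if the values differ
            if a != b:
                updated.append(updated_item(path, a, type(a), b, type(b)))
            return
        for key in keys_a:
            if key in keys_b:
                walk(a[key], b[key], path + [key])
            else:
                removed.append(path + [key])
        for key in keys_b:
            if key not in keys_a:
                added.append(path + [key])

    walk(old, new, [])
    return diff_results(True, updated, added, removed)


old = {"name": "a", "tags": ["x", "y"], "size": 1}
new = {"name": "b", "tags": ["x"], "extra": None}
result = sketch_diff(old, new)
for item in result.updated:
    print(item.path, item.old_value, "->", item.new_value)
print("added:", result.added)
print("removed:", result.removed)
```

List elements are compared by index here, so an insertion in the middle of a list shows up as a run of updates rather than a single addition; whether the library behaves the same way is not stated above.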
