``nonesafe``: safe to read, write, and read/modify/write ``dicts`` from external data

Project description

Licenses

Code: MIT

Rest: Creative Commons by Attribution

Installation

pip install --upgrade nonesafe

Or copy nonesafe.py into the current folder (note the LICENCE). Executing nonesafe.py runs some examples.

Home: GitHub

Motivation

When parsing a dictionary from an external source, e.g. a JSON request, keys might be missing, there may be unknown keys, or values might be None.

For example, suppose you know about (or only care about) keys a and b at the top level, and that a is itself a dictionary with a key c.

>>> d_ok = {'a': {'c': 1}, 'b': 0}

This would be easy to use directly as a dictionary:

>>> d_ok['a']
{'c': 1}
>>> d_ok['a']['c']
1
>>> d_ok['b']
0

But suppose that, instead of d_ok, you got the following from the external source:

>>> d_not_ok = {'a': {'c': 1}, 'not_b': 0}

Then the code above using a dictionary would fail. You could write safe accessor functions:

>>> from typing import Any
>>> def get_a(d: dict[str, Any] | None) -> Any | None:
...     return None if d is None else d.get('a', None)
>>> def get_b(d: dict[str, Any] | None) -> Any | None:
...     return None if d is None else d.get('b', None)
>>> def get_a_c(d: dict[str, Any] | None) -> Any | None:
...     a = get_a(d)
...     return None if a is None else a.get('c', None)
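The per-key accessors above can also be generalised into a single helper that walks a list of keys. The following is a minimal hand-written sketch, not part of nonesafe; the name get_path is hypothetical and introduced only for illustration.

```python
from typing import Any

def get_path(d: Any, *keys: str) -> Any:
    """Follow keys through nested dicts, returning None at the first gap.

    Hypothetical helper for illustration; it is not part of nonesafe.
    """
    for key in keys:
        if not isinstance(d, dict):
            return None  # parent is None (or not a dict), so stop safely
        d = d.get(key)
    return d

d_not_ok = {'a': {'c': 1}, 'not_b': 0}

a_c = get_path(d_not_ok, 'a', 'c')     # present: the stored value
b = get_path(d_not_ok, 'b')            # missing key: None, no KeyError
from_none = get_path(None, 'a', 'c')   # missing dict: None
```

This removes the repetition, but the field names are still plain strings scattered through the code rather than being declared in one place.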

But “there must be a better way” (apologies to Raymond Hettinger):

>>> from nonesafe import *
>>> A = nsdict('A', c=int)
>>> Safe = nsdict('Safe', a=A, b=int)

nsdict creates a new class whose constructor accepts a dict (or similar) and copies the values from the dict into the new instance. Missing values in the dict are replaced with None. If an embedded dict is missing, it is replaced with an instance of the corresponding class whose fields (leaves) are None. The example creates two classes, A and Safe, because the data contains a nested dictionary.

>>> s = Safe(d_not_ok)
>>> s.a
A(c=1)
>>> s.a.c
1
>>> s.b

The missing value b is replaced by None (the doctest above prints nothing for a None result) and the extra value not_b is ignored. An access of the form s.expr is safe: it will not raise an access exception, but it might return None instead.

There are also three utility functions.

nsget(value, default) takes a value that might be None and, if it is, returns default; otherwise it returns value. E.g.:

>>> nsget(s.b, -1)
-1

nssub(subscriptable, index) takes a subscriptable that might be None and, if it is, returns None; otherwise it returns subscriptable subscripted by index. E.g.:

>>> nssub([0], 0)
0
>>> nssub(None, 0)

The intended use of nssub is a list that might be None; nsdict is generally better for a dict.

nscall(callable, *args, **kwargs) takes a callable that might be None and, if it is, returns None; otherwise it returns the result of calling callable with args and kwargs. E.g.:

>>> nscall(lambda x, y: (x, y), 0, y=1)
(0, 1)
>>> nscall(None)

The above has only discussed reading external data. Hand coding safe writing is cumbersome.

>>> def set_a(d: dict[str, Any] | None, value: Any) -> dict[str, Any]:
...     if d is None:
...         d = {}
...     d['a'] = value
...     return d
>>> def set_b(d: dict[str, Any] | None, value: Any) -> dict[str, Any]:
...     if d is None:
...         d = {}
...     d['b'] = value
...     return d
>>> def set_a_c(d: dict[str, Any] | None, value: Any) -> dict[str, Any]:
...     if d is None:
...         d = {}
...     a = d.setdefault('a', {})  # store the nested dict in d if it was missing
...     a['c'] = value
...     return d
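The hand-written setters can likewise be folded into one generic function. This is a sketch only; set_path is a hypothetical name, not part of nonesafe, and it assumes at least one key is given.

```python
from typing import Any, Optional, Sequence

def set_path(d: Optional[dict], keys: Sequence[str], value: Any) -> dict:
    """Set a nested value, creating intermediate dicts where they are missing or None.

    Hypothetical helper for illustration; keys must contain at least one key.
    """
    if d is None:
        d = {}
    node = d
    for key in keys[:-1]:
        child = node.get(key)
        if not isinstance(child, dict):
            child = {}          # replace a missing or None level with a new dict
            node[key] = child
        node = child
    node[keys[-1]] = value
    return d

from_nothing = set_path(None, ['a', 'c'], 0)
over_none = set_path({'a': None, 'b': 1}, ['a', 'c'], 0)
```

Even in this generic form, every write needs the full key path spelled out as strings.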

Writing is much easier using nonesafe than the above, e.g.:

>>> out = Safe()

All that is needed is an instance of the safe version of the dict. In use:

>>> out.a.c = 0
>>> out.todict()
{'a': {'c': 0}}

Note how the embedded dict is auto-created and the b field which is None is omitted to reduce payload size when writing externally.
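The effect of omitting None fields can be imitated on plain dicts. The sketch below is a hypothetical helper, not how todict is implemented; in particular, dropping nested dicts that end up empty is an assumption of this sketch.

```python
from typing import Any

def drop_none(d: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of d without None values.

    Hypothetical helper; pruning nested dicts that become empty is an
    assumption made here, not documented nonesafe behaviour.
    """
    out: dict[str, Any] = {}
    for key, value in d.items():
        if isinstance(value, dict):
            value = drop_none(value)
            if not value:
                continue        # nothing left inside, so omit the whole level
        if value is not None:
            out[key] = value
    return out

pruned = drop_none({'a': {'c': 0}, 'b': None})
```

Here pruned contains only the a.c entry, matching the shape of the todict output shown above.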

Reading/modifying/writing external data is cumbersome to hand code (more so than reading or writing alone) and therefore the hand code is not shown. With nonesafe it is easy. Consider a particularly tricky example. Suppose we read:

>>> tricky = {'b': None, 'unknown': 'u'}

Then you wish to add in a.c:

>>> st = Safe(tricky)
>>> st.a.c = 0

Finally write it out again:

>>> st.todict()
{'b': None, 'unknown': 'u', 'a': {'c': 0}}

There is a lot going on in this example:

  1. a.c has been added at the end, because it was not in the input tricky.

  2. b, despite being None, is in the output because it was in tricky. A field that is in the input is retained even if it is None, which would normally be trimmed.

  3. unknown is retained, even though Safe doesn’t know about this field. It is retained because it is in the input.
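The three points above amount to a merge rule: start from the original input, then overlay the fields that now have a value. A minimal sketch of that rule on plain dicts follows; merge_for_output is a hypothetical name and this is an assumption about the observable behaviour, not the library's code.

```python
from typing import Any

def merge_for_output(original: dict[str, Any], fields: dict[str, Any]) -> dict[str, Any]:
    """Overlay non-None fields onto a copy of the original input.

    Hypothetical helper for illustration; it is not part of nonesafe.
    """
    out = dict(original)        # keeps unknown keys and explicit None values, in input order
    for key, value in fields.items():
        if value is not None:   # None fields are trimmed unless they came from the input
            out[key] = value
    return out

tricky = {'b': None, 'unknown': 'u'}
result = merge_for_output(tricky, {'a': {'c': 0}, 'b': None})
```

Because dicts preserve insertion order, keys from the input come first and the newly added a comes last, as in the todict output above.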

Details

The function nsdict makes a shallow copy of its arguments. The shallow copy is first made from the dict_fields argument and then updated with the kw_fields arguments, so a keyword field overrides a dict field of the same name. Therefore:

>>> Ex = nsdict('Ex', {'a': int}, a=A)

Matches:

>>> Ex({'a': {'c': 0}})
Ex(a=A(c=0))

The function nsdict is very flexible (following Postel's law); the following are all equivalent:

>>> Ex0 = nsdict('Ex0', {'a': int, 'b': int})
>>> Ex1 = nsdict('Ex1', [('a', int), ('b', int)])
>>> Ex2 = nsdict('Ex2', a=int, b=int)
>>> Ex3 = nsdict('Ex3', {'a': int}, b=int)
>>> Ex4 = nsdict('Ex4', [('a', int)], b=int)
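The copy-then-update rule described above is the same one the built-in dict constructor supports, which is why a dict, a list of pairs, and keyword arguments can be mixed freely. The sketch below illustrates the rule only; merge_fields is a hypothetical function, not nonesafe's implementation.

```python
def merge_fields(dict_fields=None, /, **kw_fields):
    """Shallow-copy dict_fields (a dict, an iterable of pairs, or None), then apply kw_fields.

    Hypothetical helper for illustration; it is not part of nonesafe.
    """
    merged = dict(dict_fields or ())   # None, {} and [] all give an empty dict
    merged.update(kw_fields)           # keyword fields win on a name clash
    return merged

f0 = merge_fields({'a': int, 'b': int})
f1 = merge_fields([('a', int), ('b', int)])
f2 = merge_fields(a=int, b=int)
f3 = merge_fields({'a': int}, b=int)
f4 = merge_fields([('a', int)], b=int)
override = merge_fields({'a': int}, a=str)
```

All of f0 to f4 describe the same two fields, and override shows the keyword argument replacing the dict entry for a.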

There is a reserved field name __orig_values__ that is used by todict to restore values from the original dict.

As when creating a class with nsdict, instantiating the created class makes a shallow copy of its arguments: first dict_values and then kw_values. Therefore:

>>> Ex({'a': 0}, a=A(c=0))
Ex(a=A(c=0))

Constructing an instance of a nonesafe class is also very flexible (again following Postel's law); the following are all equivalent:

>>> ex0 = Ex0({'a': 0, 'b': 1})
>>> ex1 = Ex0([('a', 0), ('b', 1)])
>>> ex2 = Ex0(a=0, b=1)
>>> ex3 = Ex0({'a': 0}, b=1)
>>> ex4 = Ex0([('a', 0)], b=1)

and these are also equivalent to each other:

>>> ex5 = Ex0({})
>>> ex6 = Ex0([])
>>> ex7 = Ex0(None)
>>> ex8 = Ex0()

Alternatives

In general there are a lot of discussions and suggestions in this space, e.g.:

The volume of discussion demonstrates the value of something like nonesafe, but also shows that there is no consensus. Therefore an officially sanctioned approach, in the stdlib, would have value.

nonesafe can be used to read, write, and read/modify/write external data. For reading only there are alternatives.

Reading

Very similar reading behaviour can be achieved with packages like Pydantic, but they are much too heavyweight for casual use and their inclusion has previously been rejected in favour of dataclasses (PEP 557).

There are many other similar approaches to pydantic:

There are also specification languages that parse strings that specify the data, e.g.:

These ‘schemas’ are generally difficult to use when the data from the external source changes and you have to specify all the data and not just the parts you are interested in.

There is also PEP 505 (currently deferred) and a proposal to revive it, Revisiting PEP 505, that failed to reach a consensus. PEP 505 proposed introducing new None-aware operators ?? (same as nsget), ?., and ?[] (the last two equivalent to nsdict's behaviour for a dict). This module is considerably easier to add than three operators (the current proof of concept is circa 100 lines) and is arguably superior, because it is declarative. Note that operators also need to be added to IDEs, type-checkers, etc. and need to be taught. For newbies and non-computer-science people they will be unfamiliar. The PEP 505 built-in operators do have one advantage: they delay the evaluation of their right-hand argument. A plain Python function cannot do this, because its arguments are evaluated before the call; only the compiler can short-circuit, as it does for and and or.
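The point about delayed evaluation can be shown with plain Python. In this sketch, nsget_sketch is a hypothetical stand-in with the behaviour described for nsget, and expensive_default is an illustrative function that records each call.

```python
calls = []

def expensive_default():
    """Illustrative default that records every time it is evaluated."""
    calls.append("called")
    return -1

def nsget_sketch(value, default):
    """Hypothetical stand-in: return default when value is None, otherwise value."""
    return default if value is None else value

value = 5

# Function call: the default argument is evaluated before the call, even though it is unused.
eager_result = nsget_sketch(value, expensive_default())
eager_calls = len(calls)

calls.clear()

# Conditional expression: the compiler short-circuits, so the default is never evaluated.
lazy_result = value if value is not None else expensive_default()
lazy_calls = len(calls)
```

Both forms return 5, but only the function call pays for computing the default. This matters only when the default is expensive or has side effects.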

Writing

PEP 505 has some capability to write, but cannot write nested data easily. Each level has to be manually written.

Read/Modify/Write

There is nothing available that supports this use case directly, but you could hand code using other packages or dicts directly.

Summary

There is great interest in this area, but no standard. There are third-party alternatives for reading, but they are large and complicated, and some have already been rejected because of their size and complexity. There are no good third-party or PEP alternatives available for writing or for read/modify/write.

Personal note

My motivation for writing nonesafe came from a previous company where we supplied a wrapper around a JSON API to customers (that was built using dataclasses) and also from processing data from an internal Asana database (this code used Pandas). In both cases the nonesafe library would have been superior (but I hadn’t thought of it!).

Possibilities for the future

In no particular order:

  1. Check field value is of correct type or None (auto-convert if possible). JSON data can be painful where "0" or 0 can be any of bool, float, or int.

  2. A field specifier that allows a custom type converter, checkers for things like ranges, marking a field as required, and defaults other than None.

  3. Allow nsdict to be used as a class decorator. Copy docstring from decorated classes. Add something like __post_init__ to check interrelated field values.

  4. Add a.b.set('c', default), to be used instead of a.b.c = nsget(a.b.c, default). Note that c has to be a leaf and is given separately, as a field name in a str.

  5. Use __slots__.

