Dotted notation parser with pattern matching
Project description
Dotted
Sometimes you want to fetch data from a deeply nested data structure. Dotted notation helps you do that.
Let's say you have a dictionary containing a dictionary containing a list and you wish to fetch the ith value from that nested list.
>>> import dotted
>>> d = {'hi': {'there': [1, 2, 3]}}
>>> dotted.get(d, 'hi.there[1]')
2
API
Probably the easiest thing to do is pydoc the api layer.
$ pydoc dotted.api
Get
See grammar discussion below about things you can do get data via dotted.
>>> import dotted
>>> dotted.get({'a': {'b': {'c': {'d': 'nested'}}}}, 'a.b.c.d')
'nested'
Update
Update will mutate the object if it can. It always returns the changed object though. If it's not mutable, then get via the return.
>>> import dotted
>>> l = []
>>> t = ()
>>> dotted.update(l, '[0]', 'hello')
['hello']
>>> l
['hello']
>>> dotted.update(t, '[0]', 'hello')
('hello',)
>>> t
()
```
Update via pattern
You can update all fields that match pattern given by either a wildcard OR regex.
>>> import dotted
>>> d = {'a': 'hello', 'b': {'bye'}}
>>> dotted.update(d, '*', 'me')
{'a': 'me', 'b': 'me'}
Remove
You can remove a field or do so only if it matches value. For example,
>>> import dotted
>>> d = {'a': 'hello', 'b': 'bye'}
>>> dotted.remove(d, 'b')
{'a': 'hello'}
>>> dotted.remove(d, 'a', 'bye')
{'a': 'hello'}
Remove via pattern
Similar to update, all patterns that match will be removed. If you provide a value as well, only the matched patterns that also match the value will be removed.
Match
Use to match a dotted-style pattern to a field. Partial matching is on by default. You can match via wildcard OR via regex. Here's a regex example:
>>> import dotted
>>> dotted.match('/a.+/', 'abced.b')
'abced.b'
>>> dotted.match('/a.+/', 'abced.b', partial=False)
With the groups=True
parameter, you'll see how it was matched:
>>> import dotted
>>> match('hello.*', 'hello.there.bye', groups=True)
('hello.there.bye', ('hello', 'there.bye'))
In the above example, hello
matched to hello
and *
matched to there.bye
(partial
matching is enabled by default).
Expand
You may wish to expand all fields that match a pattern in an object.
>>> import dotted
>>> d = {'hello': {'there': [1, 2, 3]}, 'bye': 7}
>>> dotted.expand(d, '*')
('hello', 'bye')
>>> dotted.expand(d, '*.*')
('hello.there',)
>>> dptted.expand(d, '*.*[*]')
('hello.there[0]', 'hello.there[1]', 'hello.there[2]')
>>> dotted.expand(d, '*.*[1:]')
('hello.there[1:]',)
Grammar
Dotted notation shares similarities with python. A dot .
field expects to see a
dictionary-like object (using keys
and __getitem__
internally. A bracket []
field is biased towards sequences (like lists or strs) but can also act on dicts. A
attr @
field uses getattr/setattr/delattr
. Dotted also support slicing notation
as well as transforms discussed below.
Key fields
A key field is expressed as a
or part of a dotted expression, such as a.b
. The
grammar parser is permissive for what can be in a key field. Pretty much any non-reserved
char will match. Note that key fields will only work on objects that have a keys
method. Basically, they work with dictionary or dictionary-like objects.
>>> import dotted
>>> dotted.get({'a': {'b': 'hello'}}, 'a.b')
'hello'
If the key field starts with a space or -
, you should either quote it OR you may use
a \
as the first char.
Bracketed fields
You may also use bracket notation, such as a[0]
which does a __getitem__
at key 0.
The parser prefers numeric types over string types (if you wish to look up a non-numeric
field using brackets be sure to quote it). Bracketed fields will work with pretty much
any object that can be looked up via __getitem__
.
>>> import dotted
>>> dotted.get({'a': ['first', 'second', 'third']}, 'a[0]')
'first'
>>> dotted.get({'a': {'b': 'hello'}}, 'a["b"]')
'first'
Attr fields
An attr field is expressed by prefixing with @
. This will fetch data at that attribute.
You may wonder why have this when you can just as easily use standard python to access.
Two important reasons: nested expressions and patterns.
>>> import dotted, types
>>> ns = types.SimpleNamespace
>>> ns.hello = {'me': 'goodbye'}
>>> dotted.get(ns, '@hello.me')
'goodbye'
Numeric types
The parser will attempt to interpret a field numerically if it can, such as field.1
will interpret the 1
part numerically.
>>> import dotted
>>> dotted.get({'7': 'me', 7: 'you'}, '7')
'you'
Quoting
Sometimes you need to quote a field which you can do by just putting the field in quotes.
>>> import dotted
>>> dotted.get({'has . in it': 7}, '"has . in it"')
7
The numericize #
operator
Non-integer numeric fields may be interpreted incorrectly if they have decimal point. To
solve, use the numerize operator #
at the front of a quoted field, such as #'123.45'
.
This will coerce to a numeric type (e.g. float).
>>> import dotted
>>> d = {'a': {1.2: 'hello', 1: {2: 'fooled you'}}}
>>> dotted.get(d, 'a.1.2')
'fooled you'
>>> dotted.get(d, 'a.#"1.2"')
'hello'
Slicing
Dotted slicing works like python slicing and all that entails.
>>> import dotted
>>> d = {'hi': {'there': [1, 2, 3]}, 'bye': {'there': [4, 5, 6]}}
>>> dotted.get(d, 'hi.there[::2]')
[1, 3]
>>> dotted.get(d, '*.there[1:]')
([2, 3], [5, 6])
The append +
operator
Both bracketed fileds and slices support the '+' operator which refers to the end of sequence. You may append an item or slice to the end a sequence.
>>> import dotted
>>> d = {'hi': {'there': [1, 2, 3]}, 'bye': {'there': [4, 5, 6]}}
>>> dotted.update(d, '*.there[+]', 8)
{'hi': {'there': [1, 2, 3, 8]}, 'bye': {'there': [4, 5, 6, 8]}}
>>> dotted.update(d, '*.there[+:]', [999])
{'hi': {'there': [1, 2, 3, 8, 999]}, 'bye': {'there': [4, 5, 6, 8, 999]}}
The append-unique +?
operator
If you want to update only unique items to a list, you can use the ?
postfix. This will ensure that it's only added once (see match-first below).
>>> import dotted
>>> items = [1, 2]
>>> dotted.update(items, '[+?]', 3)
[1, 2, 3]
>>> dotted.update(items, '[+?]', 3)
[1, 2, 3]
The invert -
operator
You can invert the meaning of the notation by prefixing a -
. For example,
to remove an item using update
:
>>> import dotted
>>> d = {'a': 'hello', 'b': 'bye'}
>>> dotted.update(d, '-b', dotted.ANY)
{'a': 'hello'}
>>> dotted.remove(d, '-b', 'bye again')
{'a': 'hello', 'b': 'bye again'}
Patterns
You may use dotted for pattern matching. You can match to wildcards or regular expressions. You'll note that patterns always return a tuple of matches.
>>> import dotted
>>> d = {'hi': {'there': [1, 2, 3]}, 'bye': {'there': [4, 5, 6]}}
>>> dotted.get(d, '*.there[2]')
(3, 6)
>>> dotted.get(d, '/h.*/.*')
([1, 2, 3],)
Dotted will return all values that match the pattern(s).
Wildcards
The wildcard pattern is *
. It will match anything.
Regular expressions
The regex pattern is enclosed in slashes: /regex/
. Note that if the field is a non-str,
the regex pattern will internally match to its str representation.
The match-first operatoer
You can also postfix any pattern with a ?
. This will return only
the first match.
>>> import dotted
>>> d = {'hi': {'there': [1, 2, 3]}, 'bye': {'there': [4, 5, 6]}}
>>> dotted.get(d, '*?.there[2]')
(3,)
Transforms
You can optionally add transforms to the end of dotted notation. These will
be applied on get
and update
. Transforms are separated by the |
operator
and multiple may be chained together. Transforms may be parameterized using
the :
operator.
>>> import dotted
>>> d = [1, '2', 3]
>>> dotted.get(d, '[1]')
'2'
>>> dotted.get(d, '[1]|int')
2
>>> dotted.get(d, '[0]|str:number=%d')
'number=1'
You may register new transforms via either register
or the @transform
decorator. Look at transforms.py for preregistered.
Filters
The key-value filter
You may filter by key-value to narrow your result set. You may use with key or
bracketed fields. Key-value fields may be disjunctively (OR) specified via the ,
delimiter.
A key-value field on key field looks like: keyfield.key1=value1,key2=value2...
.
This will return all key-value matches on a subordinate dict-like object. For example,
>>> d = {
... 'a': {
... 'id': 1,
... 'hello': 'there',
... },
... 'b': {
... 'id': 2,
... 'hello': 'there',
... },
... }
>>> dotted.get(d, '*.id=1')
({'id': 1, 'hello': 'there'},)
>>> dotted.get(d, '*.id=*')
({'id': 1, 'hello': 'there'}, {'id': 2, 'hello': 'there'})
A key-value field on a bracketed field looks like: [key1=value1,key2=value2...]
.
This will return all items in a list that match key-value filter. For example,
>>> d = {
... 'a': [{'id': 1, 'hello': 'there'}, {'id': 2, 'hello': 'there'}],
... 'b': [{'id': 3, 'hello': 'there'}, {'id': 4, 'hello': 'bye'}],
... }
>>> dotted.get(d, 'a[hello="there"][*].id')
(1, 2)
>>> dotted.get(d, '*[hello="there"][*].id')
r == (1, 2, 3)
The key-value first filter
You can have it match first by appending a ?
to the end of the filter.
>>> d = {
... 'a': [{'id': 1, 'hello': 'there'}, {'id': 2, 'hello': 'there'}],
... 'b': [{'id': 3, 'hello': 'there'}, {'id': 4, 'hello': 'bye'}],
... }
>>> dotted.get(d, 'a[hello="there"?]')
return [{'id': 1, 'hello': 'there'}]
Conjunction vs disjunction
To conjunctively connect filters use the .
operator. Filters offer the ability to act
disjunctively as well by using the ,
operator.
For example, given
*.key1=value1,key2=value2.key3=value3
. This will filter
(key1=value1
OR key2=value2
) AND key3=value3
.
Note that this gives you the abilty to have a key filter multiple values, such as:
*.key1=value1,key2=value2
.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file dotted-notation-0.11.0.tar.gz
.
File metadata
- Download URL: dotted-notation-0.11.0.tar.gz
- Upload date:
- Size: 22.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | e1a214cb236cd647e56c79108e8eca2a50a635b7196077eab211869d9a93b0ed |
|
MD5 | 5746f9b9dfd7e633fa51365437bf395d |
|
BLAKE2b-256 | d600fdac253b4451c7a0a290922656aeff945db0faf341b79778ed11d3a6c1d8 |