Transform nested data structures into Python objects
Project description
Installation
Run pip install data2objects, or just copy data2objects.py into your project.
Examples
The best way to explain the use of data2objects is via an example. Consider the following config.yaml file:
backbone:
    activation: +torch.nn.SiLU()
    hidden_size: 1024
readout:
    +torch.nn.Linear:
        in_features: =/backbone/hidden_size
        out_features: 1
Parsing this file using data2objects.from_yaml returns the following:
>>> import data2objects
>>> config = data2objects.from_yaml("config.yaml")
>>> print(config)
{'backbone': {'activation': SiLU(), 'hidden_size': 1024},
'readout': Linear(in_features=1024, out_features=1, bias=True)}
Under the hood, data2objects has done the following:
- identified any "reference strings" prefixed by "=" and replaced them with the corresponding values in the nested data structure; hence =/backbone/hidden_size was replaced with 1024.
- identified any "object instantiation strings" prefixed by "+", imported the corresponding objects from the provided modules and:
  - called the object if the instantiation string ends with "()", i.e. "+torch.nn.SiLU()" created a SiLU object.
  - called the object with keyword arguments if the instantiation string ends with a mapping, i.e. "+torch.nn.Linear: {in_features: =/backbone/hidden_size, out_features: 1}" created a Linear object with in_features=1024 and out_features=1.
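The import-and-call step described above can be sketched with the standard library alone. This is a minimal illustration, not the data2objects implementation: import_object and instantiate are hypothetical helper names, the sketch only handles fully qualified names, and standard-library classes stand in for the torch objects so that it runs without extra dependencies.

```python
import importlib


def import_object(dotted):
    """Import the object named by a dotted path such as 'collections.OrderedDict'."""
    module_name, _, attr = dotted.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def instantiate(spec, kwargs=None):
    """Hypothetical helper: turn a '+pkg.Name' style string into an object."""
    name = spec[1:]  # drop the leading '+'
    if name.endswith("()"):
        # "+pkg.Name()" -> call the imported object with no arguments
        return import_object(name[:-2])()
    obj = import_object(name)
    if kwargs is not None:
        # "+pkg.Name: {...}" -> call the imported object with keyword arguments
        return obj(**kwargs)
    return obj  # no parentheses and no mapping: the imported object itself


print(instantiate("+collections.OrderedDict"))    # the class, not an instance
print(instantiate("+collections.OrderedDict()"))  # an empty instance
print(instantiate("+fractions.Fraction", {"numerator": 1, "denominator": 2}))
```

The last call is the keyword-argument case: the mapping is unpacked into the call, just as the Linear example receives in_features and out_features.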
Documentation
data2objects exposes two functions, from_dict and from_yaml, which can be used to transform a nested data structure into a set of instantiated Python objects.
from_yaml
def from_yaml(thing: str | Path, modules: list[object] | None = None) -> dict:
Load a nested dictionary from a yaml file or string, and parse it using data2objects.from_dict.
If thing points to an existing file, the data in the file is loaded. Otherwise, the string is treated as containing the raw yaml data.
Parameters
thing: str | Path
    The yaml file or string to load.
modules: list[object] | None
    A list of modules to look up non-fully qualified names in.
Returns
dict
    The transformed data.
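The "existing file, otherwise raw yaml" rule can be illustrated as follows. This is only a sketch of the dispatch, under the assumption that it amounts to a file-existence check: read_source is a hypothetical helper, not part of data2objects, and it returns the yaml text without parsing it.

```python
import tempfile
from pathlib import Path


def read_source(thing):
    """Hypothetical helper: return yaml text from a file path or a raw string."""
    try:
        path = Path(thing)
        if path.is_file():
            return path.read_text()
    except (OSError, ValueError):
        # very long or otherwise invalid "paths" are treated as raw yaml
        pass
    return str(thing)


with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "config.yaml"
    config.write_text("hidden_size: 1024\n")
    print(read_source(config))  # the contents of the file

print(read_source("hidden_size: 1024"))  # the string itself
```

One consequence of this kind of dispatch is that a mistyped file name is silently treated as yaml text, so checking that the file exists before calling is a cheap safeguard.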
from_dict
def from_dict(
    data: dict[K, V], modules: list[object] | None = None
) -> dict[K, V | Any]:
Transform a nested data structure into instantiated Python objects. This function recursively processes the input data, and applies the following special handling to any str objects:
Reference handling:
Any leaf-nodes within data that are strings and start with "=" are interpreted as references to other parts of data. The syntax for these references follows the same rules as unix paths:
- "=/path": resolve path relative to the root of the data structure.
- "=./path": resolve path relative to the current working directory.
- "=../path": resolve path relative to the parent of the current working directory.
Object instantiation:
The following handling is applied to any str objects found within data (either as a key or value) that start with "+":
- attempt to import the python object specified by the string: e.g. the string "+torch.nn.Tanh" will be converted to the Tanh class (not an instance) from the torch.nn module. If the string is not an absolute path (i.e. does not contain any dots), we attempt to import it from the python standard library, or any of the provided modules:
  - "+Path" with modules=[pathlib] will be converted to the Path class from the pathlib module.
  - "+tuple" will be converted to the tuple type.
- if the string ends with "()", the resulting object is called with no arguments, e.g. "+my_module.MyClass()" will be converted to an instance of MyClass from my_module. This is equivalent to +my_module.MyClass: {} (see below).
- if the string is found as a key in a mapping with exactly one key-value pair, then:
  - if the value is itself a mapping, the single-item mapping is replaced with the result of calling the imported object with the recursively instantiated values as keyword arguments
  - otherwise, the single-item mapping is replaced with the result of calling the imported object with the instantiated value as a single positional argument
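The unix-path analogy used for references above can be made concrete with posixpath. This is a sketch under a stated assumption rather than the library's own code: the "current working directory" is taken to be the path of the mapping that holds the reference, and resolve_reference and lookup are hypothetical helpers that only walk nested dictionaries.

```python
import posixpath


def resolve_reference(ref, current):
    """Turn a '=...' reference into an absolute path within the data.

    `current` is the absolute path of the mapping holding the reference
    (an assumption about what "current working directory" means here).
    """
    target = ref[1:]  # drop the leading '='
    # join() discards `current` when `target` is absolute, exactly like cd
    return posixpath.normpath(posixpath.join(current, target))


def lookup(data, path):
    """Walk nested dictionaries along an absolute path such as '/a/b'."""
    node = data
    for key in path.strip("/").split("/"):
        if key:
            node = node[key]
    return node


data = {"backbone": {"hidden_size": 1024}, "args": [1, 2, 3]}

print(resolve_reference("=/backbone/hidden_size", "/readout"))  # /backbone/hidden_size
print(resolve_reference("=./hidden_size", "/backbone"))         # /backbone/hidden_size
print(resolve_reference("=../args", "/length"))                 # /args
print(lookup(data, "/backbone/hidden_size"))                    # 1024
```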
Parameters
data: dict[K, V]
    The data to transform.
modules: list[object] | None
    A list of modules to look up non-fully qualified names in.
Returns
dict
    The transformed data.
Examples
A basic example:
>>> from_dict({"activation": "+torch.nn.Tanh()"})
{'activation': Tanh()}
Note the importance of trailing parentheses:
>>> from_dict({"activation": "+torch.nn.Tanh"})
{'activation': <class 'torch.nn.modules.activation.Tanh'>}
Alternatively, point from_dict to automatically import from torch.nn:
>>> from_dict({"activation": "+Tanh()"}, modules=[torch.nn])
{'activation': Tanh()}
Use single-item mappings to instantiate classes/call functions with arguments. The following syntax will internally import MyClass from my_module, and call it as MyClass(x=1, y=2) with explicit keyword arguments:
>>> from_dict({
...     "activation": "+torch.nn.ReLU()",
...     "model": {
...         "+my_module.MyClass": {"x": 1, "y": 2}
...     }
... })
{'activation': ReLU(), 'model': MyClass(x=1, y=2)}
In contrast, the following syntax calls the imported object with a single positional argument:
>>> from_dict({"+len": [1, 2, 3]})
3  # i.e. len([1, 2, 3])
Mappings with multiple keys are still processed, but are never used to instantiate classes/call functions:
>>> from_dict({"+len": [1, 2, 3], "+print": "hello"})
{<built-in function len>: [1, 2, 3], <built-in function print>: 'hello'}
from_dict also works with arbitrary nesting:
>>> from_dict({"model": {"activation": "+torch.nn.Tanh()"}})
{'model': {'activation': Tanh()}}
Caution: from_dict can lead to side-effects!
>>> from_dict({"+print": "hello"})
hello
References are resolved before object instantiation, so all of the following will resolve the "length" field to 3:
>>> from_dict({"args": [1, 2, 3], "length": {"+len": "=../args"}})
3
>>> from_dict({"args": [1, 2, 3], "length": {"+len": "=/args"}})
3
File details
Details for the file data2objects-0.1.0.tar.gz.
File metadata
- Download URL: data2objects-0.1.0.tar.gz
- Upload date:
- Size: 9.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.0.1 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9c0bb1d918c83e107aac897f00df69033d173c968972d68b776c280b04aea18c |
| MD5 | b2397b76e7728ff55460ea2fd1d81bff |
| BLAKE2b-256 | caf4099e4b283b971b82be2d0e52bcdc718c55706ff5e09ceeae34e7a8bbd7cd |
Provenance
The following attestation bundles were made for data2objects-0.1.0.tar.gz:
Publisher: publish.yml on jla-gardner/data2objects
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: data2objects-0.1.0.tar.gz
  - Subject digest: 9c0bb1d918c83e107aac897f00df69033d173c968972d68b776c280b04aea18c
  - Sigstore transparency entry: 154418808
- Permalink: jla-gardner/data2objects@8cf03f23edb42ab5d219af9f817f644c44b971c0
- Branch / Tag: refs/tags/0.1.0
- Owner: https://github.com/jla-gardner
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8cf03f23edb42ab5d219af9f817f644c44b971c0
- Trigger Event: push
File details
Details for the file data2objects-0.1.0-py3-none-any.whl.
File metadata
- Download URL: data2objects-0.1.0-py3-none-any.whl
- Upload date:
- Size: 9.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.0.1 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 21c2ed613651895064d31bd881dabcf56b0a342fde513208518745498b33d9a6 |
| MD5 | 8115ff21fbe2f4ba5089ad626fccf6f4 |
| BLAKE2b-256 | 1cc4f418f0c12bb7badadec326f3609464cfad71f5c8f2378074979b4ae39017 |
Provenance
The following attestation bundles were made for data2objects-0.1.0-py3-none-any.whl:
Publisher: publish.yml on jla-gardner/data2objects
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: data2objects-0.1.0-py3-none-any.whl
  - Subject digest: 21c2ed613651895064d31bd881dabcf56b0a342fde513208518745498b33d9a6
  - Sigstore transparency entry: 154418811
- Permalink: jla-gardner/data2objects@8cf03f23edb42ab5d219af9f817f644c44b971c0
- Branch / Tag: refs/tags/0.1.0
- Owner: https://github.com/jla-gardner
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8cf03f23edb42ab5d219af9f817f644c44b971c0
- Trigger Event: push