data2objects
Transform nested data structures into Python objects.
Installation
pip install data2objects
or just copy data2objects.py into your project.
Usage
data2objects is intended to be used alongside config files, e.g. YAML files:
backbone:
    activation: +torch.nn.SiLU()
    hidden_size: 1024
readout:
    +torch.nn.Linear:
        in_features: '!~/backbone/hidden_size'
        out_features: 1
Hand off the dict-like structure returned by yaml.safe_load to data2objects.from_dict to:
- resolve references prefixed by "!" (using both global "!~/a/b/c" and relative "!../../d/e" paths)
- import any objects/classes/functions prefixed by "+"
- call any functions/classes as appropriate:
    - using no arguments for any str prefixed by "+" and ending with "()"
    - using keyword arguments for any str prefixed by "+" found as a key in a mapping with exactly one key-value pair, when the value is itself a mapping
    - using a single positional argument for any str prefixed by "+" found as a key in a mapping with exactly one key-value pair, when the value is not a mapping

(see documentation below for more details)
import yaml  # pip install pyyaml if necessary
from data2objects import from_dict

with open("config.yaml") as f:
    data = yaml.safe_load(f)
config = from_dict(data)
print(config)

{'backbone': {'activation': SiLU(), 'hidden_size': 1024},
 'readout': Linear(in_features=1024, out_features=1, bias=True)}
Combine with the fantastic dacite library to instantiate nested dataclasses as config objects:
from dataclasses import dataclass

import dacite
import torch

@dataclass
class Backbone:
    activation: torch.nn.Module
    hidden_size: int

@dataclass
class Config:
    backbone: Backbone
    readout: torch.nn.Module

final_config = dacite.from_dict(Config, config)
print(final_config)

Config(
    backbone=Backbone(activation=SiLU(), hidden_size=1024),
    readout=Linear(in_features=1024, out_features=1, bias=True)
)
Documentation
data2objects exposes a single function, from_dict, which can be used to transform a nested data structure into a set of instantiated Python objects:
def from_dict(
    data: dict[K, V], modules: list[object] | None = None
) -> dict[K, V | Any]:
Transform a nested data structure into instantiated Python objects.

This function recursively processes the input data, and applies the following special handling to any str objects:

Reference handling:

Any leaf nodes within data that are strings and start with "!" are interpreted as references to other parts of data. The following syntax is supported:

- "~path": resolve path relative to the root of the data structure.
- "path": resolve path relative to the current location.
- "../path": resolve path relative to the parent of the current location.
- and so on, like normal Unix paths.

Object instantiation:

The following handling is applied to any str objects found within data (either as a key or value) that start with "+":

- attempt to import the Python object specified by the string: e.g. the string "+torch.nn.Tanh" will be converted to the Tanh class (not an instance) from the torch.nn module. If the string is not an absolute path (i.e. does not contain any dots), we attempt to import it from the Python standard library, or any of the provided modules:
    - "+Path" with modules=[pathlib] will be converted to the Path class from the pathlib module.
    - "+tuple" will be converted to the tuple type.
- if the string ends with "()", the resulting object is called with no arguments, e.g. "+my_module.MyClass()" will be converted to an instance of MyClass from my_module.
- if the string is found as a key in a mapping with exactly one key-value pair, then:
    - if the value is itself a mapping, the single-item mapping is replaced with the result of calling the imported object with the recursively instantiated values as keyword arguments
    - otherwise, the single-item mapping is replaced with the result of calling the imported object with the instantiated value as a single positional argument

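The import rule above can be sketched in a few lines of standard-library Python. This is only an illustration, not the code in data2objects: the helper name import_object is hypothetical, and falling back to builtins for undotted names (after checking the supplied modules) is an assumption about how "the Python standard library" lookup might behave.

```python
from __future__ import annotations

import builtins
import collections
import importlib
import pathlib
from typing import Any


def import_object(spec: str, modules: list[object] | None = None) -> Any:
    """Hypothetical helper: turn a "+"-prefixed string into a Python object."""
    if not spec.startswith("+"):
        raise ValueError(f"not an import string: {spec!r}")
    name = spec[1:]
    # A trailing "()" means "call the imported object with no arguments".
    call = name.endswith("()")
    if call:
        name = name[:-2]
    if "." in name:
        # Dotted name: import the module part, then fetch the attribute.
        module_name, _, attr = name.rpartition(".")
        obj = getattr(importlib.import_module(module_name), attr)
    else:
        # Undotted name: search the supplied modules, then builtins
        # (the search order is an assumption of this sketch).
        for namespace in [*(modules or []), builtins]:
            if hasattr(namespace, name):
                obj = getattr(namespace, name)
                break
        else:
            raise ImportError(f"cannot find {name!r}")
    return obj() if call else obj


print(import_object("+collections.OrderedDict"))    # the class itself
print(import_object("+collections.OrderedDict()"))  # an empty instance
print(import_object("+Path", modules=[pathlib]) is pathlib.Path)
print(import_object("+tuple") is tuple)
```

Keeping the "()" check separate from the lookup mirrors the distinction drawn above between importing a class and instantiating it.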
Parameters

data
    The data to transform.
modules
    A list of modules to look up non-fully qualified names in.

Returns

dict
    The transformed data.

Examples

A basic example:

>>> from_dict({"activation": "+torch.nn.Tanh()"})
{'activation': Tanh()}

Note the importance of trailing parentheses:

>>> from_dict({"activation": "+torch.nn.Tanh"})
{'activation': <class 'torch.nn.modules.activation.Tanh'>}

Alternatively, point from_dict to automatically import from torch.nn:

>>> from_dict({"activation": "+Tanh()"}, modules=[torch.nn])
{'activation': Tanh()}

Use single-item mappings to instantiate classes/call functions with arguments. The following syntax will internally import MyClass from my_module, and call it as MyClass(x=1, y=2) with explicit keyword arguments:

>>> from_dict({
...     "activation": "+torch.nn.ReLU()",
...     "model": {
...         "+my_module.MyClass": {"x": 1, "y": 2}
...     }
... })
{'activation': ReLU(), 'model': MyClass(x=1, y=2)}

In contrast, the following syntax calls the imported object with a single positional argument:

>>> from_dict({"+len": [1, 2, 3]})
3  # i.e. len([1, 2, 3])

Mappings with multiple keys are still processed, but are never used to instantiate classes/call functions:

>>> from_dict({"+len": [1, 2, 3], "+print": "hello"})
{<built-in function len>: [1, 2, 3], <built-in function print>: 'hello'}

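The single-item mapping rule shown in these examples can be sketched as a small dispatch function. This is a simplified illustration and not the library's implementation: the name maybe_call is hypothetical, and the sketch assumes the "+" keys have already been replaced by the imported callables.

```python
from __future__ import annotations

from collections.abc import Mapping
from typing import Any


def maybe_call(mapping: Mapping[Any, Any]) -> Any:
    """Hypothetical helper applying the single-item mapping rule."""
    if len(mapping) != 1:
        # Several keys: the mapping is kept as data and nothing is called.
        return dict(mapping)
    ((func, value),) = mapping.items()
    if not callable(func):
        return dict(mapping)
    if isinstance(value, Mapping):
        # Mapping value: call with keyword arguments.
        return func(**value)
    # Any other value: call with a single positional argument.
    return func(value)


print(maybe_call({len: [1, 2, 3]}))                 # 3
print(maybe_call({complex: {"real": 1, "imag": 2}}))  # (1+2j)
print(maybe_call({len: [1, 2, 3], str: "hello"}) == {len: [1, 2, 3], str: "hello"})  # True
```

Checking the number of keys first is what keeps multi-key mappings inert, matching the last example above.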
from_dict also works with arbitrary nesting:

>>> from_dict({"model": {"activation": "+torch.nn.Tanh()"}})
{'model': {'activation': Tanh()}}

Caution: from_dict can lead to side-effects!

>>> from_dict({"+print": "hello"})
hello

References are resolved before object instantiation, so all of the following will resolve the "length" field to 3:

>>> from_dict({"args": [1, 2, 3], "length": {"+len": "!../args"}})
3
>>> from_dict({"args": [1, 2, 3], "length": {"+len": "!~args"}})
3
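To make the reference syntax more concrete, here is a minimal sketch of a path resolver for "!"-prefixed strings. It is not taken from data2objects: the function name resolve_reference is hypothetical, it only walks through mappings, and it assumes that "current location" means the mapping that contains the reference string (which is consistent with "!../args" reaching the top-level "args" key in the example above).

```python
from __future__ import annotations

from typing import Any


def resolve_reference(root: dict, container_path: list[str], ref: str) -> Any:
    """Hypothetical helper: look up a "!"-prefixed reference inside ``root``.

    ``container_path`` lists the keys leading from ``root`` to the mapping
    that holds the reference (an assumption of this sketch).
    """
    if not ref.startswith("!"):
        raise ValueError(f"not a reference: {ref!r}")
    path = ref[1:]
    if path.startswith("~"):
        # Global reference: start from the root of the data structure.
        keys: list[str] = []
        path = path[1:]
    else:
        # Relative reference: start from the current location.
        keys = list(container_path)
    for part in path.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            if not keys:
                raise KeyError(f"{ref!r} climbs above the root")
            keys.pop()
        else:
            keys.append(part)
    node: Any = root
    for key in keys:
        node = node[key]
    return node


data = {"args": [1, 2, 3], "length": {"+len": "!../args"}}
print(resolve_reference(data, ["length"], "!../args"))  # [1, 2, 3]
print(resolve_reference(data, ["length"], "!~args"))    # [1, 2, 3]

config = {"backbone": {"hidden_size": 1024}, "readout": {}}
print(resolve_reference(config, ["readout"], "!~/backbone/hidden_size"))  # 1024
```

Resolving every reference to a plain value first, and only then importing and calling, is what lets len receive the actual list rather than the string "!../args".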
File details
Details for the file data2objects-0.0.1.tar.gz.
File metadata
- Download URL: data2objects-0.0.1.tar.gz
- Upload date:
- Size: 7.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.8.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b562a4b3cba435be23b9e2a900c352543275770ad981c80395c191afd9836d63 |
| MD5 | a0b56ac91e5e3371d348bfbf7a1479b5 |
| BLAKE2b-256 | 7d7c0dde54929f428c1d623f9e6b41e9b1484f5ffbd487f82bcdee788ae149fe |
File details
Details for the file data2objects-0.0.1-py3-none-any.whl.
File metadata
- Download URL: data2objects-0.0.1-py3-none-any.whl
- Upload date:
- Size: 7.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.8.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 61dcea6310a178342a81c24a0c51f4925ea290b7ea2da4090bbd9b0397c8fd9b |
| MD5 | 05a16b28ff718e0ef2f8b7dc50ab341a |
| BLAKE2b-256 | adb03bb9facbcfb10f27cf3e5bd552a17c6b53c5b9c2b868256b56301f8f0567 |