Simple creation of dataclasses from JSON
dataglasses
A small package to simplify creating dataclasses from JSON and validating that JSON.
Installation
$ pip install dataglasses
Requirements
Requires Python 3.10 or later.
If you wish to validate arbitrary JSON data against the generated JSON schemas in Python, consider installing jsonschema, though this is unnecessary when using dataglasses to convert JSON into dataclasses.
Quick start
>>> from dataclasses import dataclass
>>> from dataglasses import from_dict, to_json_schema
>>> from json import dumps
>>> @dataclass
... class InventoryItem:
...     name: str
...     unit_price: float
...     quantity_on_hand: int = 0
>>> from_dict(InventoryItem, { "name": "widget", "unit_price": 3.0})
InventoryItem(name='widget', unit_price=3.0, quantity_on_hand=0)
>>> print(dumps(to_json_schema(InventoryItem), indent=2))
print output...
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$ref": "#/$defs/InventoryItem",
"$defs": {
"InventoryItem": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"unit_price": {
"type": "number"
},
"quantity_on_hand": {
"type": "integer",
"default": 0
}
},
"required": [
"name",
"unit_price"
]
}
}
}
Objective
The purpose of this library is to speed up development by making it trivial to populate dataclasses with dictionary data extracted from JSON (or elsewhere), as well as to perform basic validation on that data. The library consists of just one file and two functions, so it can even be copied directly into a project.
It is not intended for complex validation or high performance. For those, consider using pydantic.
Usage
The package contains just two functions:
def from_dict(
    cls: type[T],
    value: Any,
    *,
    strict: bool = False,
    transform: Optional[TransformRules] = None,
    local_refs: Optional[set[type]] = None,
) -> T
This converts a nested dictionary value of input data into the given dataclass type cls, raising an exception if the conversion is not possible. (The optional keyword arguments are described further down.)
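To make the description above concrete, here is a minimal standard-library sketch of the general idea: walk the dataclass fields, recurse into nested dataclasses, and raise if something does not fit. It is not the dataglasses implementation; `sketch_from_dict`, `Point` and `Segment` are hypothetical names, and the sketch only handles plain leaf types and nested dataclasses (it is also stricter than the library, which accepts an integer for a float field in the examples further down).

```python
from dataclasses import MISSING, dataclass, fields, is_dataclass
from typing import Any, get_type_hints


def sketch_from_dict(cls: type, value: Any) -> Any:
    """Hypothetical helper: build dataclass `cls` from a nested dict."""
    if not is_dataclass(cls):
        # Leaf type (str, int, float, ...): a plain isinstance check.
        if not isinstance(value, cls):
            raise TypeError(f"expected {cls.__name__}, got {value!r}")
        return value
    if not isinstance(value, dict):
        raise TypeError(f"expected an object for {cls.__name__}")
    hints = get_type_hints(cls)  # resolved field types, keyed by field name
    kwargs = {}
    for f in fields(cls):
        if f.name in value:
            kwargs[f.name] = sketch_from_dict(hints[f.name], value[f.name])
        elif f.default is MISSING and f.default_factory is MISSING:
            raise TypeError(f"missing required field {f.name!r}")
    return cls(**kwargs)


@dataclass
class Point:
    x: float
    y: float


@dataclass
class Segment:
    start: Point
    end: Point
    label: str = "unnamed"


segment = sketch_from_dict(
    Segment, {"start": {"x": 0.0, "y": 0.0}, "end": {"x": 1.0, "y": 2.0}}
)
print(segment.end, segment.label)  # Point(x=1.0, y=2.0) unnamed
```

Fields with a default are simply left out of the constructor call when absent, so the dataclass itself supplies the default.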
def to_json_schema(
    cls: type,
    *,
    strict: bool = False,
    transform: Optional[TransformRules] = None,
    local_refs: Optional[set[type]] = None,
) -> dict[str, Any]
This generates a 2020-12 JSON schema representing valid inputs for the dataclass type cls, raising an exception if the class cannot be represented in JSON. (Again, the optional keyword arguments are described further down.)
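As a rough illustration of how a schema can be derived from dataclass metadata, the sketch below builds the "properties", "required" and "default" entries for a flat dataclass using only the standard library. It is an assumption-laden sketch rather than the library's code: `sketch_flat_schema`, `JSON_TYPES` and `Widget` are hypothetical names, and only four leaf types are covered.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Any, get_type_hints

# Assumed mapping from Python leaf types to JSON schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_flat_schema(cls: type) -> dict[str, Any]:
    """Hypothetical helper: schema for a dataclass with leaf-typed fields only."""
    hints = get_type_hints(cls)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for f in fields(cls):
        prop: dict[str, Any] = {"type": JSON_TYPES[hints[f.name]]}
        if f.default is not MISSING:
            prop["default"] = f.default  # a field with a default is optional
        elif f.default_factory is MISSING:
            required.append(f.name)  # no default of any kind: required
        properties[f.name] = prop
    return {"type": "object", "properties": properties, "required": required}


@dataclass
class Widget:
    name: str
    unit_price: float
    quantity: int = 0


schema = sketch_flat_schema(Widget)
print(schema["required"])  # ['name', 'unit_price']
```

An unsupported field type surfaces here as a `KeyError`; a fuller version would raise a clearer exception, as the description above says the real function does.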
Below is a summary of the different supported use cases:
Nested structures
Dataclasses can be nested, using either global or local definitions.
>>> @dataclass
... class TrackedItem:
...
...     @dataclass
...     class GPS:
...         lat: float
...         long: float
...
...     item: InventoryItem
...     location: GPS
>>> from_dict(TrackedItem, {
... "item": { "name": "pie", "unit_price": 42},
... "location": { "lat": 52.2, "long": 0.1 } })
TrackedItem(item=InventoryItem(name='pie', unit_price=42, quantity_on_hand=0),
location=TrackedItem.GPS(lat=52.2, long=0.1))
>>> print(dumps(to_json_schema(TrackedItem), indent=2))
print output...
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$ref": "#/$defs/TrackedItem",
"$defs": {
"TrackedItem": {
"type": "object",
"properties": {
"item": {
"$ref": "#/$defs/InventoryItem"
},
"location": {
"$ref": "#/$defs/TrackedItem.GPS"
}
},
"required": [
"item",
"location"
]
},
"InventoryItem": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"unit_price": {
"type": "number"
},
"quantity_on_hand": {
"type": "integer",
"default": 0
}
},
"required": [
"name",
"unit_price"
]
},
"TrackedItem.GPS": {
"type": "object",
"properties": {
"lat": {
"type": "number"
},
"long": {
"type": "number"
}
},
"required": [
"lat",
"long"
]
}
}
}
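The "$defs" keys in the output above (for example "TrackedItem.GPS") look like Python qualified names. That is an inference from the output, not something stated here, but it is easy to see how `__qualname__` would yield such keys. `defs_ref`, `Outer` and `Inner` below are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Outer:
    @dataclass
    class Inner:
        value: int

    inner: Inner


def defs_ref(cls: type) -> str:
    """Hypothetical helper: a '#/$defs/...' pointer built from __qualname__."""
    return f"#/$defs/{cls.__qualname__}"


# At module level, a nested class's qualified name includes its parent.
print(defs_ref(Outer))
print(defs_ref(Outer.Inner))
```

Using the qualified name rather than `__name__` keeps two nested classes with the same short name from colliding in "$defs".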
Collection types
There is automatic support for the generic collection types most compatible with JSON: list[T], tuple[...] and Sequence[T] (encoded as arrays) and dict[str, T] and Mapping[str, T] (encoded as objects).
>>> from collections.abc import Mapping, Sequence
>>> @dataclass
... class Catalog:
...     items: Sequence[InventoryItem]
...     publisher: tuple[str, int]
...     purchases: Mapping[str, int]
>>> from_dict(Catalog, {
... "items": [{ "name": "widget", "unit_price": 3.0}],
... "publisher": ["ACME", 1982],
... "purchases": { "Wile E. Coyote": 52}})
Catalog(items=[InventoryItem(name='widget', unit_price=3.0, quantity_on_hand=0)],
publisher=('ACME', 1982), purchases={'Wile E. Coyote': 52})
>>> print(dumps(to_json_schema(Catalog), indent=2))
print output...
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$ref": "#/$defs/Catalog",
"$defs": {
"Catalog": {
"type": "object",
"properties": {
"items": {
"type": "array",
"items": {
"$ref": "#/$defs/InventoryItem"
}
},
"publisher": {
"type": "array",
"prefixItems": [
{
"type": "string"
},
{
"type": "integer"
}
],
"minItems": 2,
"maxItems": 2
},
"purchases": {
"type": "object",
"patternProperties": {
"^.*$": {
"type": "integer"
}
}
}
},
"required": [
"items",
"publisher",
"purchases"
]
},
"InventoryItem": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"unit_price": {
"type": "number"
},
"quantity_on_hand": {
"type": "integer",
"default": 0
}
},
"required": [
"name",
"unit_price"
]
}
}
}
Unrestricted types like list or dict (or set or Any) and mappings with non-str keys can be used with from_dict but not with to_json_schema. Alternatively, these, alongside unsupported generic types like set[T], can be used with both from_dict and to_json_schema by defining an appropriate encoding transformation (see section below).
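The array and object encodings described in this section can be sketched with `typing.get_origin` and `typing.get_args`. This is an illustrative sketch, not the library's code: `sketch_collection_schema` and `LEAF_TYPES` are hypothetical names, and the fragments simply mirror the shapes in the output above.

```python
from collections.abc import Mapping, Sequence
from typing import Any, get_args, get_origin

LEAF_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_collection_schema(tp: Any) -> dict[str, Any]:
    """Hypothetical helper: schema fragment for a leaf or generic collection."""
    if tp in LEAF_TYPES:
        return {"type": LEAF_TYPES[tp]}
    origin, args = get_origin(tp), get_args(tp)
    if origin in (list, Sequence):
        return {"type": "array", "items": sketch_collection_schema(args[0])}
    if origin is tuple:
        # Fixed-length tuples only; tuple[T, ...] ends up in the TypeError below.
        return {
            "type": "array",
            "prefixItems": [sketch_collection_schema(a) for a in args],
            "minItems": len(args),
            "maxItems": len(args),
        }
    if origin in (dict, Mapping) and args[0] is str:
        return {
            "type": "object",
            "patternProperties": {"^.*$": sketch_collection_schema(args[1])},
        }
    # Bare list/dict, set[T], and non-str keys have no encoding in this sketch.
    raise TypeError(f"no JSON encoding for {tp!r}")


print(sketch_collection_schema(Sequence[int]))
# {'type': 'array', 'items': {'type': 'integer'}}
```

Bare `list` or `dict` carry no type arguments, so the sketch has nothing to describe their contents with, which is one plausible reason such types need a transformation before a schema can be produced.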
Optional and Union types
Union types (S | T or Union[S, T, ...]) are matched against all their permitted subtypes in order, returning the first successful match, or raising an exception if there are none. Optional types (T | None or Optional[T]) are handled similarly. Note that an optional type is not the same as an optional field (i.e. one with a default): a field with an optional type is still a required field unless it has a default value (which could be None but could also be something else).
>>> from typing import Optional
>>> @dataclass
... class ItemPurchase:
...     items: Sequence[InventoryItem | TrackedItem]
...     invoice: Optional[int] = None
>>> from_dict(ItemPurchase, {
... "items": [{
... "item": { "name": "pie", "unit_price": 42},
... "location": { "lat": 52.2, "long": 0.1 } }],
... "invoice": 1234})
ItemPurchase(items=[TrackedItem(item=
InventoryItem(name='pie', unit_price=42, quantity_on_hand=0),
location=TrackedItem.GPS(lat=52.2, long=0.1))], invoice=1234)
>>> print(dumps(to_json_schema(ItemPurchase), indent=2))
print output...
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$ref": "#/$defs/ItemPurchase",
"$defs": {
"ItemPurchase": {
"type": "object",
"properties": {
"items": {
"type": "array",
"items": {
"anyOf": [
{
"$ref": "#/$defs/InventoryItem"
},
{
"$ref": "#/$defs/TrackedItem"
}
]
}
},
"invoice": {
"anyOf": [
{
"type": "integer"
},
{
"type": "null"
}
],
"default": null
}
},
"required": [
"items"
]
},
"InventoryItem": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"unit_price": {
"type": "number"
},
"quantity_on_hand": {
"type": "integer",
"default": 0
}
},
"required": [
"name",
"unit_price"
]
},
"TrackedItem": {
"type": "object",
"properties": {
"item": {
"$ref": "#/$defs/InventoryItem"
},
"location": {
"$ref": "#/$defs/TrackedItem.GPS"
}
},
"required": [
"item",
"location"
]
},
"TrackedItem.GPS": {
"type": "object",
"properties": {
"lat": {
"type": "number"
},
"long": {
"type": "number"
}
},
"required": [
"lat",
"long"
]
}
}
}
Enum and Literal types
Both Enum and Literal types can be used to match explicit enumerations. By default, Enum types match both the values and symbolic names (preferring the former in case of a clash). This behaviour can be overridden using a transformation if desired (see section below).
>>> from enum import auto, StrEnum  # StrEnum requires Python 3.11 or later
>>> from typing import Literal
>>> class BuildType(StrEnum):
...     DEBUG = auto()
...     OPTIMIZED = auto()
>>> @dataclass
... class Release:
...     build: BuildType
...     approved: Literal["Yes", "No"]
>>> from_dict(Release, {"build": "debug", "approved": "Yes"})
Release(build=<BuildType.DEBUG: 'debug'>, approved='Yes')
>>> print(dumps(to_json_schema(Release), indent=2))
print output...
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$ref": "#/$defs/Release",
"$defs": {
"Release": {
"type": "object",
"properties": {
"build": {
"enum": [
"debug",
"optimized",
"DEBUG",
"OPTIMIZED"
]
},
"approved": {
"enum": [
"Yes",
"No"
]
}
},
"required": [
"build",
"approved"
]
}
}
}
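The "values first, then symbolic names" rule for Enum types can be sketched with the standard `enum` module. `sketch_match_enum` and `Mode` are hypothetical names, and the example deliberately builds a clash between one member's value and another member's name to show which lookup wins.

```python
from enum import Enum
from typing import Any


def sketch_match_enum(enum_cls: type[Enum], value: Any) -> Enum:
    """Hypothetical helper: match by value first, then by symbolic name."""
    try:
        return enum_cls(value)  # lookup by value
    except ValueError:
        pass
    if isinstance(value, str) and value in enum_cls.__members__:
        return enum_cls[value]  # lookup by name
    raise TypeError(f"{value!r} is not a valid {enum_cls.__name__}")


class Mode(Enum):
    FAST = "fast"
    SLOW = "FAST"  # this value clashes with the name of the first member


print(sketch_match_enum(Mode, "fast"))  # Mode.FAST (by value)
print(sketch_match_enum(Mode, "SLOW"))  # Mode.SLOW (by name)
print(sketch_match_enum(Mode, "FAST"))  # Mode.SLOW (value wins the clash)
```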
Annotated types
Annotated types can be used to populate the property "description" annotations in the JSON schema.
>>> from typing import Annotated
>>> @dataclass
... class InventoryItem:
...     name: Annotated[str, "item name"]
...     unit_price: Annotated[float, "unit price"]
...     quantity_on_hand: Annotated[int, "quantity on hand"] = 0
>>> from_dict(InventoryItem, { "name": "widget", "unit_price": 3.0})
InventoryItem(name='widget', unit_price=3.0, quantity_on_hand=0)
>>> print(dumps(to_json_schema(InventoryItem), indent=2))
print output...
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$ref": "#/$defs/InventoryItem",
"$defs": {
"InventoryItem": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "item name"
},
"unit_price": {
"type": "number",
"description": "unit price"
},
"quantity_on_hand": {
"type": "integer",
"description": "quantity on hand",
"default": 0
}
},
"required": [
"name",
"unit_price"
]
}
}
}
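For readers unfamiliar with how Annotated metadata is read back at runtime, the sketch below extracts a string annotation per field with `typing.get_type_hints(..., include_extras=True)`. The helper name `sketch_descriptions` and the `Sensor` class are hypothetical, and picking the first string in the metadata is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Annotated, Optional, get_args, get_origin, get_type_hints


def sketch_descriptions(cls: type) -> dict[str, Optional[str]]:
    """Hypothetical helper: first string annotation of each field, if any."""
    # include_extras=True keeps the Annotated wrapper instead of stripping it.
    hints = get_type_hints(cls, include_extras=True)
    result: dict[str, Optional[str]] = {}
    for name, tp in hints.items():
        description = None
        if get_origin(tp) is Annotated:
            # get_args returns (base_type, *metadata) for Annotated types.
            metadata = get_args(tp)[1:]
            description = next((m for m in metadata if isinstance(m, str)), None)
        result[name] = description
    return result


@dataclass
class Sensor:
    name: Annotated[str, "sensor name"]
    reading: float


print(sketch_descriptions(Sensor))  # {'name': 'sensor name', 'reading': None}
```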
Forward references
Forward reference types (written as string literals or ForwardRef objects) are supported, permitting recursive dataclasses. Global and class-scoped references are handled automatically:
>>> @dataclass
... class Cons:
...     head: "Head"
...     tail: Optional["Cons"] = None
...
...     @dataclass
...     class Head:
...         v: int
...
...     def __repr__(self):
...         return f"{self.head.v}::{self.tail}"
>>> from_dict(Cons, {"head": {"v": 1}, "tail": {"head": {"v": 2}}})
1::2::None
>>> print(dumps(to_json_schema(Cons), indent=2))
print output...
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$ref": "#/$defs/Cons",
"$defs": {
"Cons": {
"type": "object",
"properties": {
"head": {
"$ref": "#/$defs/Cons.Head"
},
"tail": {
"anyOf": [
{
"$ref": "#/$defs/Cons"
},
{
"type": "null"
}
],
"default": null
}
},
"required": [
"head"
]
},
"Cons.Head": {
"type": "object",
"properties": {
"v": {
"type": "integer"
}
},
"required": [
"v"
]
}
}
}
Locally-scoped references, however, must be specified using the local_refs keyword:
>>> def reverse_cons(seq):
...
...     @dataclass
...     class Cons:
...         head: int
...         tail: Optional["Cons"] = None
...
...         def __repr__(self):
...             return f"{self.head}::{self.tail}"
...
...     value = None
...     for x in seq: value = { "head": x, "tail": value }
...     return from_dict(Cons, value, local_refs={Cons})
>>> reverse_cons([1,2,3])
>>> reverse_cons([1,2,3])
3::2::1::None
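The standard library shows the same limitation with locally-scoped forward references: a string annotation naming a class defined inside a function cannot be found in the module globals or the class namespace. The sketch below demonstrates this with `typing.get_type_hints`; whether dataglasses resolves references the same way internally is not stated here, so treat the parallel as an assumption. `make_node_class` and `Node` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, get_type_hints


def make_node_class() -> type:
    @dataclass
    class Node:
        value: int
        next: Optional["Node"] = None  # refers to a function-local class

    return Node


node_cls = make_node_class()  # deliberately not bound to the name "Node"

try:
    get_type_hints(node_cls)
    resolved_unaided = True
except NameError:
    # "Node" is in neither the module globals nor the class namespace.
    resolved_unaided = False

# Supplying the local name explicitly lets the reference resolve.
hints = get_type_hints(node_cls, localns={"Node": node_cls})
print(resolved_unaided)                     # False
print(hints["next"] == Optional[node_cls])  # True
```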
Strict mode
Both from_dict and to_json_schema default to ignoring additional properties that are not part of a dataclass (similar to additionalProperties defaulting to true in JSON schemas). This can be disabled with the strict keyword.
>>> value = { "name": "widget", "unit_price": 4.0, "comment": "too expensive"}
>>> from_dict(InventoryItem, value)
InventoryItem(name='widget', unit_price=4.0, quantity_on_hand=0)
>>> from_dict(InventoryItem, value, strict=True)
TypeError: Unexpected <class '__main__.InventoryItem'> fields {'comment'}
>>> print(dumps(to_json_schema(InventoryItem, strict=True), indent=2))
print output...
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$ref": "#/$defs/InventoryItem",
"$defs": {
"InventoryItem": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "item name"
},
"unit_price": {
"type": "number",
"description": "unit price"
},
"quantity_on_hand": {
"type": "integer",
"description": "quantity on hand",
"default": 0
}
},
"required": [
"name",
"unit_price"
],
"additionalProperties": false
}
}
}
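The check behind strict mode amounts to comparing the input keys with the dataclass field names. The sketch below is a minimal standard-library version under that assumption; `unexpected_keys`, `check_extra` and `Product` are hypothetical names, and the error text only loosely imitates the message shown above.

```python
from dataclasses import dataclass, fields
from typing import Any


def unexpected_keys(cls: type, value: dict[str, Any]) -> set[str]:
    """Hypothetical helper: input keys that are not fields of `cls`."""
    return set(value) - {f.name for f in fields(cls)}


def check_extra(cls: type, value: dict[str, Any], strict: bool = False) -> None:
    """Raise only in strict mode; otherwise extra keys are ignored."""
    extra = unexpected_keys(cls, value)
    if strict and extra:
        raise TypeError(f"Unexpected {cls} fields {extra}")


@dataclass
class Product:
    name: str
    unit_price: float


payload = {"name": "widget", "unit_price": 4.0, "comment": "too expensive"}
check_extra(Product, payload)            # lenient: passes silently
print(unexpected_keys(Product, payload))  # {'comment'}
```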
Transformations
Transformations allow you to override the handling of specific types or dataclass fields, and can be used to normalise inputs or convert them into different types, including ones that aren't normally supported. Transformations are specified with the transform keyword, using a mapping:
- the mapping keys are either:
  - a type used somewhere in the output dataclass: e.g. str or set[int]
  - a dataclass field specified by a class-name tuple: e.g. (InventoryItem, "name") or (Cons, "head")
- the mapping values are a tuple consisting of:
  - the JSON-serialisable input type that we want to represent this output type or field
  - a callable function to convert from that input type to the output type
Note that the input type can be the same as the output type. Conversely, note that transformations don't help with serialising the dataclasses back into JSON from non-serialisable types.
>>> @dataclass
... class Person:
...     name: str
...     aliases: set[str]
>>> transform = {
... str: (str, str.title),
... set[str]: (list[str], set),
... (Person, "name"): (str, lambda s: s + "!")}
>>> from_dict(Person, {"name": "robert", "aliases": ["bob", "bobby"]}, transform=transform)
Person(name='Robert!', aliases={'Bobby', 'Bob'})
>>> print(dumps(to_json_schema(Person, transform=transform), indent=2))
print output...
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$ref": "#/$defs/Person",
"$defs": {
"Person": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"aliases": {
"type": "array",
"items": {
"type": "string"
}
}
},
"required": [
"name",
"aliases"
]
}
}
}
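One way to picture how such a mapping could be applied is the standard-library sketch below. It converts the input to the rule's input type first (which may itself trigger other rules) and then calls the rule's function. This order of application reproduces the shape of the output shown above, but whether dataglasses composes rules this way is an assumption; `convert`, `build`, `Rules` and `Contact` are hypothetical names.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable, get_args, get_origin, get_type_hints

Rules = dict[Any, tuple[Any, Callable[[Any], Any]]]


def convert(tp: Any, value: Any, rules: Rules) -> Any:
    """Hypothetical helper: convert `value` to `tp`, applying type rules."""
    if tp in rules:
        input_type, fn = rules[tp]
        # A rule mapping a type to itself skips the recursive step to avoid looping.
        inner = value if input_type == tp else convert(input_type, value, rules)
        return fn(inner)
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        return [convert(item_type, item, rules) for item in value]
    return value  # no rule and no known container: pass through unchanged


def build(cls: type, data: dict[str, Any], rules: Rules) -> Any:
    """Hypothetical helper: flat dataclass construction with field rules."""
    hints = get_type_hints(cls)
    kwargs = {}
    for f in fields(cls):
        if (cls, f.name) in rules:
            # A field rule takes the place of the field's declared type.
            input_type, fn = rules[(cls, f.name)]
            kwargs[f.name] = fn(convert(input_type, data[f.name], rules))
        else:
            kwargs[f.name] = convert(hints[f.name], data[f.name], rules)
    return cls(**kwargs)


@dataclass
class Contact:
    name: str
    tags: set[str]


rules: Rules = {
    str: (str, str.title),
    set[str]: (list[str], set),
    (Contact, "name"): (str, lambda s: s + "!"),
}
contact = build(Contact, {"name": "robert", "tags": ["bob", "bobby"]}, rules)
print(contact.name)          # Robert!
print(sorted(contact.tags))  # ['Bob', 'Bobby']
```

Parameterised types such as `set[str]` are hashable and compare equal to one another, which is what allows them to serve as mapping keys here.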
Contributions
Bug reports, feature requests and contributions are very welcome. Note that PRs must include tests with 100% code coverage and pass the necessary quality checks before they can be merged.
To run the tests, make sure you have uv installed, then type:
$ uv run task tests
To perform the formatting and linting checks, type:
$ uv run task check
To automatically fix any fixable formatting and linting issues, type:
$ uv run task format