deserialize

A library to make deserialization easy. To get started, just run pip install deserialize

How it used to be

Without the library, if you wanted to convert:

{
    "a": 1,
    "b": 2
}

into a dedicated class, you had to do something like this:

class MyThing:

    def __init__(self, a, b):
        self.a = a
        self.b = b

    @staticmethod
    def from_json(json_data):
        a_value = json_data.get("a")
        b_value = json_data.get("b")

        if a_value is None:
            raise Exception("'a' was None")
        elif b_value is None:
            raise Exception("'b' was None")
        elif type(a_value) != int:
            raise Exception("'a' was not an int")
        elif type(b_value) != int:
            raise Exception("'b' was not an int")

        return MyThing(a_value, b_value)

my_instance = MyThing.from_json(json_data)

How it is now

With deserialize all you need to do is this:

import deserialize

class MyThing:
    a: int
    b: int

my_instance = deserialize.deserialize(MyThing, json_data)

That's it. It will pull out all the data and set it for you, type checking each value and even checking for null values.

If you want null values to be allowed though, that's easy too:

class MyThing:
    a: int | None
    b: int | None

Now None is a valid value for these.

Supported Collection Types

The library natively supports the following collection types:

  • list - Ordered, mutable sequences

    items: list[int]  # [1, 2, 3]
    
  • dict - Key-value mappings

    mapping: dict[str, int]  # {"a": 1, "b": 2}
    
  • set - Unordered collections of unique elements (automatically removes duplicates)

    tags: set[str]  # {"python", "rust", "go"}
    
  • tuple - Immutable sequences

    • Fixed-length with heterogeneous types:
      coords: tuple[int, int]  # (10, 20)
      rgb: tuple[int, int, int]  # (255, 128, 0)
      
    • Variable-length with homogeneous types:
      values: tuple[int, ...]  # (1, 2, 3, 4, 5)
      

Collections can be nested arbitrarily. For example:

class Actor:
    name: str
    age: int

class Episode:
    title: str
    identifier: str
    actors: list[Actor]

class Season:
    episodes: list[Episode]
    completed: bool

class TVShow:
    seasons: list[Season]
    creator: str
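Data shaped like the following would map onto those classes (the titles and names here are invented for illustration):

```python
# Invented sample data matching the nested class structure above.
# With the library installed, deserialize.deserialize(TVShow, data)
# would populate every level of the tree.
data = {
    "seasons": [
        {
            "episodes": [
                {
                    "title": "Pilot",
                    "identifier": "s01e01",
                    "actors": [{"name": "Hodor", "age": 42}],
                }
            ],
            "completed": True,
        }
    ],
    "creator": "A. Writer",
}

print(data["seasons"][0]["episodes"][0]["actors"][0]["name"])  # Hodor
```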

Advanced Usage

Field Configuration with Annotated

You can use Field with Annotated type hints to configure field behavior. This is the recommended approach as it provides a modern, Pythonic API that's familiar to users of libraries like Pydantic and FastAPI.

from deserialize import Annotated, deserialize, Field

class User:
    user_id: Annotated[int, Field(alias="userId")]
    email: Annotated[str, Field(alias="emailAddress")]
    is_active: Annotated[bool, Field(default=True)]

data = {"userId": 123, "emailAddress": "user@example.com"}
user = deserialize(User, data)
# user.user_id = 123
# user.email = "user@example.com"  
# user.is_active = True (from default)

Field supports the following options:

  • alias: Alternative key name in source data

    user_id: Annotated[int, Field(alias="userId")]
    
  • default: Default value if field is missing

    is_active: Annotated[bool, Field(default=True)]
    
  • parser: Function to transform the value before assignment

    import datetime
    
    created_at: Annotated[datetime.datetime, Field(parser=datetime.datetime.fromtimestamp)]
    
  • ignore: Skip this field during deserialization

    internal_id: Annotated[str, Field(ignore=True)]
    

Benefits of using Field:

  • Configuration is co-located with the field definition
  • Better IDE autocomplete and type checking
  • More familiar to users of modern Python libraries
  • Easier to read - all field info in one place
  • Type-safe and validated at the point of declaration

Handling Different Key Names

Data often comes with keys that don't match Python naming conventions. Use Field(alias=...) to map between them:

from deserialize import Annotated, Field

class MyClass:
    # Map 'id' in data to 'identifier' in Python
    identifier: Annotated[str, Field(alias="id")]
    value: int

For automatic camelCase/PascalCase to snake_case conversion, use the auto_snake decorator:

from deserialize import Annotated, Field, auto_snake

@auto_snake()
class MyClass:
    some_integer: int
    some_string: str
    # Automatically maps "SomeInteger" and "SomeString" from data
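The name conversion auto_snake performs can be sketched with a small stdlib regex (this illustrates the naming rule only, not the library's internals, and it doesn't handle acronym runs like "HTTPServer"):

```python
import re

def to_snake(name: str) -> str:
    # Insert an underscore before each uppercase letter (except at the
    # start of the string), then lowercase everything.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()

print(to_snake("SomeInteger"))  # some_integer
print(to_snake("someString"))   # some_string
```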

Unhandled Fields

Usually, if you don't specify a field in your definition but it exists in the data, it will be ignored. If you want to be notified about extra fields, set throw_on_unhandled=True when calling deserialize(...):

# Will raise an exception if data has fields not defined in MyClass
result = deserialize(MyClass, data, throw_on_unhandled=True)

To explicitly allow specific fields to be unhandled, use the @allow_unhandled decorator:

@deserialize.allow_unhandled("metadata")
class MyClass:
    value: int
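Conceptually, the throw_on_unhandled check amounts to comparing the data's keys against the declared fields. A stdlib-only sketch of the idea (not the library's code):

```python
import typing

def check_unhandled(cls, data, allowed=()):
    # Keys present in the data but absent from the class annotations
    # and not explicitly allowed.
    extra = set(data) - set(typing.get_type_hints(cls)) - set(allowed)
    if extra:
        raise ValueError(f"Unhandled fields: {sorted(extra)}")

class MyClass:
    value: int

# No exception: 'metadata' is explicitly allowed, mirroring @allow_unhandled.
check_unhandled(MyClass, {"value": 1, "metadata": {}}, allowed=("metadata",))
print("ok")
```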

Ignored Fields

Some properties in your class may not come from the deserialized data. Mark them as ignored using Field(ignore=True):

from deserialize import Annotated, Field

class MyClass:
    value: int
    # This field won't be deserialized
    identifier: Annotated[str, Field(ignore=True)]

Value Transformation with Parsers

Transform values during deserialization using Field(parser=...). This is useful when the data format doesn't match your desired type:

from deserialize import Annotated, Field
import datetime

class Result:
    successful: bool
    # Convert Unix timestamp to datetime
    timestamp: Annotated[datetime.datetime, Field(parser=datetime.datetime.fromtimestamp)]

# Input: {"successful": True, "timestamp": 1543770752}
# result.timestamp will be a datetime object

The parser runs before type checking. If your field accepts None, ensure your parser handles it:

def parse_timestamp(value):
    if value is None:
        return None
    return datetime.datetime.fromtimestamp(value)

class Result:
    timestamp: Annotated[datetime.datetime | None, Field(parser=parse_timestamp)]

Subclassing

Subclassing is fully supported. Properties from parent classes are automatically included during deserialization:

class Shape:
    color: str

class Rectangle(Shape):
    width: int
    height: int

# Will deserialize both 'color' and rectangle-specific fields

Raw Data Storage

Keep a reference to the raw data used for construction by setting the raw_storage_mode parameter:

from deserialize import deserialize, RawStorageMode

result = deserialize(MyClass, data, raw_storage_mode=RawStorageMode.ROOT)
# Access via: result.__deserialize_raw__

Options:

  • RawStorageMode.ROOT: Store raw data only on the root object
  • RawStorageMode.ALL: Store raw data on all objects in the tree

Default Values

Provide default values for missing fields using Field(default=...):

from deserialize import Annotated, Field

class IntResult:
    successful: bool
    value: Annotated[int, Field(default=0)]

# Input: {"successful": True}
# result.value will be 0

Note: Defaults only apply when the field is missing from the data. If the field is present with value None, it will fail unless the type allows None.

Post-processing

Not everything can be set on your data straight away; some values need to be computed afterwards. For that you need post-processing, and the easiest way to do it is the @constructed decorator. It takes a function which is called with each newly constructed instance as its argument. Here's an example which converts a polar coordinate's angle from degrees to radians:

import math

data = {
    "angle": 180.0,
    "magnitude": 42.0
}

def convert_to_radians(instance):
    instance.angle = instance.angle * math.pi / 180

@deserialize.constructed(convert_to_radians)
class PolarCoordinate:
    angle: float
    magnitude: float

pc = deserialize.deserialize(PolarCoordinate, data)

print(pc.angle, pc.magnitude)

>>> 3.141592653589793 42.0
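The hook mechanism itself can be sketched in plain Python: wrap the class's __init__ so the hook runs after construction. This is an illustration of the idea, not the library's internals.

```python
import math

def constructed(hook):
    """Decorator factory: call `hook(instance)` after each construction."""
    def wrapper(cls):
        original_init = cls.__init__
        def __init__(self, *args, **kwargs):
            original_init(self, *args, **kwargs)
            hook(self)
        cls.__init__ = __init__
        return cls
    return wrapper

def convert_to_radians(instance):
    instance.angle = instance.angle * math.pi / 180

@constructed(convert_to_radians)
class PolarCoordinate:
    def __init__(self, angle, magnitude):
        self.angle = angle
        self.magnitude = magnitude

pc = PolarCoordinate(180.0, 42.0)
print(pc.angle, pc.magnitude)  # 3.141592653589793 42.0
```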

Downcasting

Data often embeds its own type as a field, which can be difficult to parse. For example:

data = [
    {
        "data_type": "foo",
        "foo_prop": "Hello World",
    },
    {
        "data_type": "bar",
        "bar_prop": "Goodbye World",
    }
]

Since the fields differ between the two, there's no good way of parsing this data. You could use optional fields on some base class, try multiple deserializations until you find the right one, or do the deserialization based on a mapping you build of the data_type field. None of those solutions are elegant though, and all have issues if the types are nested. Instead, you can use the downcast_field and downcast_identifier decorators.

downcast_field is specified on a base class and gives the name of the field that contains the type information. downcast_identifier takes a base class and an identifier (which should be one of the possible values of the downcast_field from the base class). Internally, when a class with a downcast field is detected, that field will be extracted and a subclass with a matching identifier will be searched for. If no such class exists, an UndefinedDowncastException will be thrown.
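The lookup can be pictured as a registry mapping identifiers to subclasses. A stdlib-only sketch of that idea (an illustration, not the library's internals):

```python
# Maps downcast identifiers to the subclass registered for them.
_registry: dict[str, type] = {}

def register(identifier):
    """Decorator: record `identifier -> class` in the registry."""
    def wrapper(cls):
        _registry[identifier] = cls
        return cls
    return wrapper

class Base:
    pass

@register("foo")
class Foo(Base):
    pass

@register("bar")
class Bar(Base):
    pass

def dispatch(item: dict) -> type:
    """Pick the subclass named by the item's type field."""
    try:
        return _registry[item["data_type"]]
    except KeyError:
        # The library raises UndefinedDowncastException in this situation.
        raise ValueError(f"Unknown data_type: {item['data_type']!r}")

print(dispatch({"data_type": "foo"}).__name__)  # Foo
```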

Here's an example which would handle the above data:

@deserialize.downcast_field("data_type")
class MyBase:
    type_name: str


@deserialize.downcast_identifier(MyBase, "foo")
class Foo(MyBase):
    foo_prop: str


@deserialize.downcast_identifier(MyBase, "bar")
class Bar(MyBase):
    bar_prop: str


result = deserialize.deserialize(list[MyBase], data)

Here, result[0] will be an instance of Foo and result[1] will be an instance of Bar.

If you can't describe all of your types, you can use @deserialize.allow_downcast_fallback on your base class and any unknowns will be left as dictionaries.

Custom Deserializing

If none of the above work for you, sometimes there's no choice but to turn to customized deserialization code. Doing so is easy: implement the CustomDeserializable protocol by adding a deserialize classmethod to your class, like so:

from typing import Any

class MyObject(deserialize.CustomDeserializable):

    name: str
    age: int

    def __init__(self, name: str, age: int) -> None:
        self.name = name
        self.age = age

    @classmethod
    def deserialize(cls, value: Any) -> "MyObject":
        assert isinstance(value, list)
        assert len(value) == 2

        return cls(value[0], value[1])

Normally you'd use a dictionary to create an object (something like {"name": "Hodor", "age": 42}), but this now allows us to use a list instead, e.g. my_instance = deserialize.deserialize(MyObject, ["Hodor", 42]).

No type checking is done on the result or input. It's entirely on the implementer at this point.


Deprecated Features

The following decorator-based features are deprecated and maintained only for backward compatibility. New code should use Field with Annotated instead.

Deprecated: @key decorator

❌ Old way (deprecated):

@deserialize.key("identifier", "id")
class MyClass:
    identifier: str

✅ New way:

from deserialize import Annotated, Field

class MyClass:
    identifier: Annotated[str, Field(alias="id")]

Deprecated: @default decorator

❌ Old way (deprecated):

@deserialize.default("value", 0)
class MyClass:
    value: int

✅ New way:

from deserialize import Annotated, Field

class MyClass:
    value: Annotated[int, Field(default=0)]

Deprecated: @parser decorator

❌ Old way (deprecated):

@deserialize.parser("timestamp", datetime.datetime.fromtimestamp)
class Result:
    timestamp: datetime.datetime

✅ New way:

from deserialize import Annotated, Field
import datetime

class Result:
    timestamp: Annotated[datetime.datetime, Field(parser=datetime.datetime.fromtimestamp)]

Deprecated: @ignore decorator

❌ Old way (deprecated):

@deserialize.ignore("identifier")
class MyClass:
    identifier: str

✅ New way:

from deserialize import Annotated, Field

class MyClass:
    identifier: Annotated[str, Field(ignore=True)]

Note: If both Field and decorators are used for the same field, Field takes precedence.
