Dynamic pydantic models

dynapydantic

dynapydantic is an extension to the pydantic Python package that allows for dynamic tracking of pydantic.BaseModel subclasses.

Installation

This project can be installed via PyPI:

pip install dynapydantic

or with conda via the conda-forge channel:

conda install dynapydantic

Motivation

Consider the following simple class setup:

import pydantic

class Base(pydantic.BaseModel):
    pass

class A(Base):
    field: int

class B(Base):
    field: str

class Model(pydantic.BaseModel):
    val: Base

As expected, we can use A's and B's for Model.val:

>>> m = Model(val=A(field=1))
>>> m
Model(val=A(field=1))

However, we quickly run into trouble when serializing and validating:

>>> m.model_dump()
{'val': {}}
>>> m.model_dump(serialize_as_any=True)
{'val': {'field': 1}}
>>> Model.model_validate(m.model_dump(serialize_as_any=True))
Model(val=Base())

Pydantic provides a solution for serialization via serialize_as_any (and its corresponding field annotation SerializeAsAny), but offers no native solution for the validation half. Currently, the canonical way of doing this is to annotate the field as a union of all subclasses. Often, a single field in the model is chosen as the "discriminator" in a discriminated union. The discriminated pattern is the most robust way to do this, as it eliminates ambiguity between the union members. This library, dynapydantic, automates this process.
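For reference, the manual pattern described above can be sketched in plain pydantic as follows. This is a minimal illustration, assuming pydantic v2; the discriminator field name `name` and the alias `AnyBase` are hypothetical choices made for this example and are not part of dynapydantic.

```python
import typing as ty

import pydantic


class Base(pydantic.BaseModel):
    pass


class A(Base):
    # Hand-written discriminator field (the name "name" is an arbitrary choice)
    name: ty.Literal["A"] = "A"
    field: int


class B(Base):
    name: ty.Literal["B"] = "B"
    field: str


# Every subclass has to be listed by hand and kept in sync with the hierarchy
AnyBase = ty.Annotated[ty.Union[A, B], pydantic.Field(discriminator="name")]


class Model(pydantic.BaseModel):
    val: AnyBase


m = Model(val=A(field=1))
restored = Model.model_validate(m.model_dump())
print(type(restored.val).__name__)  # A
```

The round trip now works, but the union has to be edited every time a subclass is added, which is the bookkeeping this library aims to remove.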

Let's reframe the above problem with dynapydantic:

import dynapydantic
import pydantic

class Base(
    dynapydantic.SubclassTrackingModel,
    discriminator_field="name",
    discriminator_value_generator=lambda t: t.__name__,
):
    pass

class A(Base):
    field: int

class B(Base):
    field: str

class Model(pydantic.BaseModel):
    val: dynapydantic.Polymorphic[Base]

Now, the same set of operations works as intended:

>>> m = Model(val=A(field=1))
>>> m
Model(val=A(field=1, name='A'))
>>> m.model_dump()
{'val': {'field': 1, 'name': 'A'}}
>>> Model.model_validate(m.model_dump())
Model(val=A(field=1, name='A'))

How it works

TrackingGroup

The core entity in this library is the dynapydantic.TrackingGroup:

import typing as ty

import dynapydantic
import pydantic

mygroup = dynapydantic.TrackingGroup(
    name="mygroup",
    discriminator_field="name"
)

@mygroup.register("A")
class A(pydantic.BaseModel):
    """A class to be tracked, will be tracked as "A"."""
    a: int

@mygroup.register()
class B(pydantic.BaseModel):
    """Another class, will be tracked as "B"."""
    name: ty.Literal["B"] = "B"
    a: int

class Model(pydantic.BaseModel):
    """A model that can have A or B"""
    field: mygroup.union()  # call after all subclasses have been registered

print(Model(field={"name": "A", "a": 4})) # field=A(a=4, name='A')
print(Model(field={"name": "B", "a": 5})) # field=B(name='B', a=5)

The union() method produces a discriminated union of all registered pydantic.BaseModel subclasses. It also accepts a plain=True keyword argument to produce a plain UnionType for use in type annotations, but since this is a runtime-computed union, it will not work with static type checkers. The union is based on a discriminator field, which is configured by the discriminator_field argument to TrackingGroup. The field can be created by hand, as was shown with B, or dynapydantic will inject it for you, as was shown with A.

TrackingGroup has a few opt-in features to make it more powerful and easier to use:

  1. discriminator_value_generator: This parameter is an optional callback function that is called with each class that gets registered and produces a default value for the discriminator field. This allows the user to call register() without a value for the discriminator. For example, passing lambda cls: cls.__name__ would use the name of the class as the discriminator value.
  2. plugin_entry_point: This parameter indicates to dynapydantic that there might be models to be discovered in other packages. Packages are discovered by the Python entrypoint mechanism. See the tests/example directory for an example of how this works.

SubclassTrackingModel

The most common use case of this pattern is to automatically register subclasses of a given pydantic.BaseModel. This is supported via the use of dynapydantic.SubclassTrackingModel. For example:

import typing as ty

import dynapydantic
import pydantic

class Base(
    dynapydantic.SubclassTrackingModel,
    discriminator_field="name",
    discriminator_value_generator=lambda cls: cls.__name__,
):
    """Base model, will track its subclasses"""

    # The TrackingGroup can be specified here like model_config, or passed in
    # kwargs of the class declaration, just like how model_config works with
    # pydantic.BaseModel. If you do it like this, you have to give the tracking
    # group a name, whereas using kwargs will generate the name for you.
    # tracking_config: ty.ClassVar[dynapydantic.TrackingGroup] = dynapydantic.TrackingGroup(
    #     name="BaseSubclasses",
    #     discriminator_field="name",
    #     discriminator_value_generator=lambda cls: cls.__name__,
    # )


class Intermediate(Base, exclude_from_union=True):
    """Subclasses can opt out of being tracked"""

class Derived1(Intermediate):
    """Non-direct descendants are registered"""
    a: int

class Derived2(Intermediate):
    """You can override the value generator if desired"""
    name: ty.Literal["Custom"] = "Custom"
    a: int

print(Base.registered_subclasses())
# {'Derived1': <class '__main__.Derived1'>, 'Custom': <class '__main__.Derived2'>}

# if plugin_entry_point was specified, load plugin packages
# Base.load_plugins()

class Model(pydantic.BaseModel):
    """A model that can have any registered Base subclass"""
    field: dynapydantic.Polymorphic[Base]

print(Model(field={"name": "Derived1", "a": 4}))
# field=Derived1(a=4, name='Derived1')
print(Model(field={"name": "Custom", "a": 5}))
# field=Derived2(name='Custom', a=5)

It is important to note that the subclasses that are supported are those that were defined prior to defining the model that uses dynapydantic.Polymorphic (Model in the above example). If you declare additional subclasses afterwards, you must call .model_rebuild(force=True) on the model that uses the subclass union.
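As a rough mental model for the tracking shown in this section, the following plain-Python sketch registers subclasses through `__init_subclass__`. It is not dynapydantic's implementation; `TrackedBase` and its `_registry` attribute are hypothetical names, and only the `exclude_from_union` keyword mirrors the example above.

```python
import typing as ty


class TrackedBase:
    """Hypothetical stand-in for a subclass-tracking base class."""

    _registry: ty.ClassVar[dict[str, type]] = {}

    def __init_subclass__(
        cls, exclude_from_union: bool = False, **kwargs: ty.Any
    ) -> None:
        super().__init_subclass__(**kwargs)
        if not exclude_from_union:
            # Key each class by its name, like a name-based value generator
            TrackedBase._registry[cls.__name__] = cls


class Intermediate(TrackedBase, exclude_from_union=True):
    """Opted out, so it never appears in the registry."""


class Derived1(Intermediate):
    """Indirect descendants are still registered."""


class Derived2(Intermediate):
    pass


print(sorted(TrackedBase._registry))  # ['Derived1', 'Derived2']
```

Because registration happens at class definition time, anything that read the registry before a subclass was defined has to be refreshed, which is the role model_rebuild(force=True) plays for pydantic models.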

Alternative union methods

Caution: dynapydantic does NOT test if your models have ambiguities in them. This is up to YOU.

Non-discriminated unions should only be used when you can PROVE that all possible subclasses will parse unambiguously. If there is ambiguity in the models, you can get unexpected results. If plugins are used, it is highly discouraged to use anything besides discriminated unions.

While the default discriminated union is the recommended and most robust approach, it does require a field in the model to act as the discriminator. If the full list of union members is known to the author ahead of time and can be proven to be unambiguous from a validation perspective, then the discriminator field can be omitted and a "smart" or "left_to_right" union may be used. TrackingGroup and SubclassTrackingModel support these modes as well via the union_mode argument:

import typing as ty

import dynapydantic
import pydantic

class Base(
    dynapydantic.SubclassTrackingModel,
    union_mode="smart",
):
    """dynapydantic.Polymorphic[Base] will be a "smart" A | B"""

class A(Base):
    a: int

class B(Base):
    b: int

class Model(pydantic.BaseModel):
    field: dynapydantic.Polymorphic[Base]

print(Model(field={"b": 5}))
# field=B(b=5)
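To make the ambiguity risk concrete, here is a small sketch in plain pydantic (v2 assumed), without dynapydantic. The models `Circle` and `Square` are hypothetical and deliberately share the same shape, so a non-discriminated union cannot tell them apart when validating a plain dict.

```python
import typing as ty

import pydantic


class Circle(pydantic.BaseModel):
    size: int


class Square(pydantic.BaseModel):
    size: int


class Model(pydantic.BaseModel):
    # Plain (non-discriminated) union, validated in pydantic's default mode
    shape: ty.Union[Circle, Square]


original = Model(shape=Square(size=3))
restored = Model.model_validate(original.model_dump())

print(type(original.shape).__name__)  # Square
print(type(restored.shape).__name__)  # Circle
```

Both members accept the dumped data equally well, so the leftmost one is chosen and the round trip silently changes the type. A discriminator field removes this class of problem.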

When are unions realized?

When using TrackingGroup directly, there is only one option for when the union is realized, which is the moment you call .union(). At this point, any classes that have been registered will be present in the union. Registering additional classes after a call to .union() will not update the union returned by a previous call, so it is important to consider order of operations.
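The snapshot behavior can be pictured with a small standard-library sketch. `SnapshotGroup` is a hypothetical class written for this illustration and is not how TrackingGroup is implemented; it only shows why a union built at call time does not see later registrations.

```python
import typing as ty


class SnapshotGroup:
    """Hypothetical registry whose union() reflects the state at call time."""

    def __init__(self) -> None:
        self._classes: list[type] = []

    def register(self, cls: type) -> type:
        self._classes.append(cls)
        return cls

    def union(self) -> ty.Any:
        # Build a Union from whatever has been registered so far
        return ty.Union[tuple(self._classes)]


group = SnapshotGroup()


@group.register
class A:
    pass


@group.register
class B:
    pass


early = group.union()  # realized with A and B only


@group.register
class C:
    pass


late = group.union()  # a fresh call picks up C

print([c.__name__ for c in ty.get_args(early)])  # ['A', 'B']
print([c.__name__ for c in ty.get_args(late)])  # ['A', 'B', 'C']
```

The object bound to `early` is never updated, so any annotation that captured it keeps the older member list.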

When using SubclassTrackingModel, there are more options, and each comes with its own tradeoffs:

  1. Calling .union() directly: This functions exactly as it does with TrackingGroup. This option is the "most eager" option, but it is also the most sensitive to order of operations. In addition, type checkers will not understand this method, as they will complain about calling a function in a type annotation (rightfully so).

    Despite the tradeoffs, this option can be desirable for applications that inspect field annotations directly. This normally arises in user-implemented model reflection code and with pydantic_settings.

  2. Using dynapydantic.Polymorphic[T]: This method defers the union realization slightly, into the schema generation step for the model. The difference between this and option 1 is subtle, but it does have an effect with recursive models. Consider the following:

    import dynapydantic
    import pydantic
    
    class Base(dynapydantic.SubclassTrackingModel, union_mode="smart"):
        pass
    
    class A(Base, extra="forbid"):
        a: int
    
    class B(Base, extra="forbid"):
        other: dynapydantic.Polymorphic[Base]
    
    B(other={"other": {"other": {"a": 2}}}) # ValidationError (union only has A)
    
    B.model_rebuild(force=True)
    B(other={"other": {"other": {"a": 2}}}) # B(other=B(other=B(other=A(a=2))))
    

    If we had used Base.union() directly, the model_rebuild() call would do nothing, as the union had already been realized. To accomplish the same thing with .union(), we would have to use a forward reference, like "BUnion", and then call .union() right before the model_rebuild() call.

    Unlike direct union() calls, the type checker can at least infer the field to be a subclass of Base, which is a vast improvement over a type error.

  3. EXPERIMENTAL Using implicit_polymorphic: If implicit_polymorphic=True is passed to a SubclassTrackingModel, then union realization is deferred to model validation time, making the process robust to order of operations. This reduces the previous example down to:

    import dynapydantic
    import pydantic
    
    class Base(
        dynapydantic.SubclassTrackingModel,
        union_mode="smart",
        implicit_polymorphic=True,
    ):
        pass
    
    class A(Base, extra="forbid"):
        a: int
    
    class B(Base, extra="forbid"):
        other: Base
    
    B(other={"other": {"other": {"a": 2}}}) # B(other=B(other=B(other=A(a=2))))
    

    This option has the cleanest syntax, but it incurs a runtime penalty from potentially multiple schema compilations and from the need for a field validator function, whereas options 1 and 2 can produce a static schema. As with option 2, type checkers can interpret the field as the base class.
