Bidirectional bridge between Protocol Buffers and Pydantic: generate pydantic models from .proto files with lossless round-trips to proto messages and wire bytes

protodantic

License: MIT

Bidirectional bridge between Protocol Buffers and Pydantic.

Point it at your .proto files and it generates plain pydantic v2 models — with full validation — where every model round-trips losslessly to and from real protobuf messages, wire bytes, and proto JSON. The pydantic → proto direction is a first-class citizen: to_proto_bytes() produces genuine wire-format output that any protobuf consumer in any language can parse.

Install

uv add protodantic-py   # or: pip install protodantic-py

The distribution is named protodantic-py (the plain name is squatted on PyPI); the import stays protodantic:

import protodantic

Usage

Given demo.proto:

syntax = "proto3";
package demo;

message Address {
  string street = 1;
  string city = 2;
}

message User {
  int64 id = 1;
  string name = 2;
  Address address = 3;
  repeated string tags = 4;
  optional string nickname = 5;
}

Generate models:

protodantic generate demo.proto -o models.py

Then:

from models import User, Address

user = User(id=7, name="kory", address=Address(city="Warsaw"), tags=["a", "b"])

# pydantic -> proto: real wire format, readable by any protobuf runtime
data: bytes = user.to_proto_bytes()
msg = user.to_proto()          # a live protobuf Message
json_str = user.to_proto_json()  # canonical proto JSON

# proto -> pydantic: parse + validate in one step
restored = User.from_proto_bytes(data)
assert restored == user
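
To make "real wire format" concrete, here is a minimal hand-rolled sketch of how the first two User fields (id and name) are laid out under the protobuf encoding rules. It does not use protodantic and is not its implementation; every helper name below is made up for illustration, and the bytes returned by to_proto_bytes() for the full message above would also carry the address and tags fields.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A field key is the varint of (field_number << 3) | wire_type."""
    return encode_varint((field_number << 3) | wire_type)


def encode_int64_field(field_number: int, value: int) -> bytes:
    """Wire type 0 (VARINT): key followed by the value."""
    return encode_tag(field_number, 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Wire type 2 (LEN): key, byte length, then the UTF-8 payload."""
    payload = text.encode("utf-8")
    return encode_tag(field_number, 2) + encode_varint(len(payload)) + payload


# `int64 id = 1` set to 7 and `string name = 2` set to "kory"
wire = encode_int64_field(1, 7) + encode_string_field(2, "kory")
print(wire.hex(" "))  # 08 07 12 04 6b 6f 72 79
```

Any protobuf runtime reads those eight bytes the same way, which is why output in this format is portable across languages.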

Or drive it from Python:

from protodantic import compile_fdset, generate_source

source = generate_source(compile_fdset(["demo.proto"]))

Type mapping

proto                                       pydantic
------------------------------------------  ----------------------------------------------------------------
int32/64, uint32/64, sint, fixed            range-validated int (out-of-range fails at construction)
float, double                               float
string / bytes / bool                       str / bytes / bool
enum                                        generated OpenEnum (IntEnum that preserves unknown wire values; proto3 enums are open)
message                                     generated ProtoModel (nested types flattened as Outer_Inner)
repeated T / map<K, V>                      list[T] / dict[K, V]
optional, oneof members, singular messages  T | None (presence-aware: None ⇄ unset)
oneof groups                                mutual exclusion enforced by a model validator
google.protobuf.Timestamp                   datetime.datetime (UTC; naive input treated as UTC)
google.protobuf.Duration                    datetime.timedelta
google.protobuf.*Value wrappers             T | None
google.protobuf.Struct / Value / ListValue  dict[str, Any] / Any / list[Any]
google.protobuf.Any                         typing.Any (accepts any ProtoModel; packed/unpacked via the model registry)
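
The "open enum" idea in the enum row can be illustrated with the standard library alone. The sketch below is an assumption about how such a type might behave, not protodantic's OpenEnum: the class name OpenIntEnumSketch, the Color enum, and the UNKNOWN_<n> naming are all hypothetical.

```python
from enum import IntEnum


class OpenIntEnumSketch(IntEnum):
    """IntEnum that accepts values it has no member for, instead of raising."""

    @classmethod
    def _missing_(cls, value):
        # Called by the enum machinery when `value` matches no declared member.
        if not isinstance(value, int):
            return None  # let the normal ValueError happen
        member = int.__new__(cls, value)
        member._name_ = f"UNKNOWN_{value}"  # hypothetical naming scheme
        member._value_ = value
        return member


class Color(OpenIntEnumSketch):
    RED = 0
    GREEN = 1


known = Color(1)      # a declared member
unknown = Color(7)    # e.g. a value added by a newer schema version
print(known.name, unknown.name, int(unknown))
```

The unknown value survives as the integer 7, so re-serializing it would not silently turn it into a declared member.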

Field names that collide with Python keywords or pydantic internals (class, from, model_config, ...) get a trailing underscore (class_), with the proto name kept as a populate alias. The same rule applies to message/enum type names and enum members that are Python keywords or would shadow generated code (message list becomes class list_); the proto full name stays the source of truth. Same-named messages in different packages get package-qualified class names; every model is also reachable via protodantic.model_for("pkg.Message").
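
As a rough illustration of the renaming rule, the sketch below appends an underscore to colliding names and returns the original proto name as the alias. It is not protodantic's code: safe_field_name and the RESERVED set are hypothetical, and the list of reserved names the generator actually uses may differ.

```python
from __future__ import annotations

import keyword

# Hypothetical sample of names that would clash with pydantic internals.
RESERVED = {"model_config", "model_fields", "model_dump"}


def safe_field_name(proto_name: str) -> tuple[str, str | None]:
    """Return (python_name, alias); alias is None when no rename was needed."""
    if keyword.iskeyword(proto_name) or proto_name in RESERVED:
        return proto_name + "_", proto_name
    return proto_name, None


for name in ("class", "from", "model_config", "city"):
    print(name, "->", safe_field_name(name))
```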

Semantics worth knowing:

  • Validation on mutation is on by default (validate_assignment=True): assigning a second oneof member or an out-of-range int raises immediately. Opt out per-model with standard pydantic config on a subclass.
  • protodantic.NULL expresses an explicit JSON null in a google.protobuf.Value field (None means unset). In model_dump_json() it serializes as real null; python-mode dumps keep the sentinel.
  • Subclassing a generated model does not affect parsing: from_proto/model_for keep resolving to the generated class. To make your subclass the resolution target (e.g. to add custom validators applied on parse), re-declare __proto_full_name__ in its body — explicit opt-in.
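
The first point above (a second oneof member is rejected at assignment time) can be mimicked in plain Python. This is only a behavioral sketch under stated assumptions: the generated models rely on pydantic validators rather than __setattr__, and OneofSketch with its email/phone members is a made-up example.

```python
from __future__ import annotations


class OneofSketch:
    """Stand-in for a model with a oneof group containing `email` and `phone`."""

    _ONEOF_MEMBERS = ("email", "phone")

    def __init__(self, email: str | None = None, phone: str | None = None):
        # Set both slots without checking, then validate the group once.
        object.__setattr__(self, "email", email)
        object.__setattr__(self, "phone", phone)
        self._check_oneof()

    def _check_oneof(self) -> None:
        set_members = [n for n in self._ONEOF_MEMBERS if getattr(self, n) is not None]
        if len(set_members) > 1:
            raise ValueError(f"oneof members are mutually exclusive, got {set_members}")

    def __setattr__(self, name: str, value) -> None:
        previous = getattr(self, name, None)
        object.__setattr__(self, name, value)
        try:
            self._check_oneof()
        except ValueError:
            object.__setattr__(self, name, previous)  # roll back the bad write
            raise


contact = OneofSketch(email="a@example.com")
try:
    contact.phone = "555-0100"  # second member of the same group
except ValueError as exc:
    print("rejected:", exc)
```

Note the contrast with raw protobuf messages, where setting a second oneof member clears the first instead of raising.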

Interop with existing _pb2 code

Already consuming a centralized proto package as protoc-generated _pb2 modules? Generated models interoperate directly:

user = User.from_proto(their_pb2_user_instance)   # accepts _pb2 messages
their_msg = their_pb2.User.FromString(user.to_proto_bytes())  # canonical bytes

How it works

protoc (bundled via grpcio-tools) compiles your protos to a FileDescriptorSet, which codegen embeds in the generated module. At runtime, ProtoModel builds dynamic protobuf message classes from those descriptors — no _pb2.py files needed, and no protobuf internals leak into your models.

If several imported generated modules define the same proto type, the registry behind model_for() / nested-message resolution is last-import-wins.

Status & roadmap

Requires Python ≥ 3.11. proto3 only by design (proto2 input is rejected with a clear error). The full supported-behavior spec lives in tests/ — every test documents one use case. Documented policies: unknown fields are dropped when a model re-serializes (the model is the source of truth), and naive datetimes are interpreted as UTC.
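
The naive-datetime policy can be spelled out with the standard library. The sketch below is an assumption about what "interpreted as UTC" amounts to when producing the seconds/nanos pair of a google.protobuf.Timestamp; to_timestamp_parts is a hypothetical helper, not part of protodantic.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_timestamp_parts(dt: datetime) -> tuple[int, int]:
    """Return (seconds, nanos) since the Unix epoch, treating naive input as UTC."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # naive: assume UTC, do not shift
    else:
        dt = dt.astimezone(timezone.utc)      # aware: convert to UTC
    delta = dt - _EPOCH
    seconds = delta.days * 86400 + delta.seconds
    nanos = delta.microseconds * 1000         # always in [0, 999_999_000]
    return seconds, nanos


naive = datetime(2024, 1, 1, 12, 0, 0)
aware = datetime(2024, 1, 1, 14, 0, 0, tzinfo=timezone(timedelta(hours=2)))
print(to_timestamp_parts(naive) == to_timestamp_parts(aware))  # True
```

Both inputs denote the same instant once the naive value is read as UTC, so they map to the same pair.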

  • 0.1.0 (current) — greenfield: .proto → pydantic codegen with lossless bidirectional round-trips, plus the semantics future drops build on.
  • 0.2.0 — brownfield — reverse schema codegen (pydantic models → .proto), generating from installed _pb2 packages by descriptor reflection, to_proto(into=TheirPb2Class).
  • 0.3.0 — performance — benchmark suite (vs json.loads+pydantic, raw _pb2, betterproto), then cached field plans and trusted-construction fast paths.

gRPC service stubs are out of scope: protodantic is a message layer.
