Pydantic model migrations and schemas

pyrmute

Pydantic model migrations and schema management with semantic versioning.

pyrmute handles the complexity of data model evolution so you can confidently make changes without breaking your production systems. Version your models, define transformations, and let pyrmute automatically migrate legacy data through multiple versions.

Key Features

  • Version your models - Track schema evolution with semantic versioning
  • Automatic migration chains - Transform data across multiple versions (1.0.0 → 2.0.0 → 3.0.0) in a single call
  • Type-safe transformations - Migrations return validated Pydantic models, catching errors before they reach production
  • Flexible schema export - Generate JSON schemas for all versions with support for $ref, custom generators, and nested models
  • Production-ready - Batch processing, parallel execution, and streaming support for large datasets
  • Only one dependency - Pydantic
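
To make the idea of a migration chain concrete, the following is a minimal, library-independent sketch of how single-step migrations might be composed. It does not use pyrmute and is not a description of its internals; the `MIGRATIONS` registry, the `migration` decorator, and `migrate_chain` are hypothetical names introduced only for illustration.

```python
from typing import Any, Callable

Data = dict[str, Any]
Step = Callable[[Data], Data]

# Hypothetical registry: maps a (from_version, to_version) pair to a function.
MIGRATIONS: dict[tuple[str, str], Step] = {}


def migration(src: str, dst: str) -> Callable[[Step], Step]:
    """Register a single-step migration from src to dst."""
    def register(func: Step) -> Step:
        MIGRATIONS[(src, dst)] = func
        return func
    return register


@migration("1.0.0", "2.0.0")
def split_name(data: Data) -> Data:
    # partition() returns an empty last name when there is no space.
    first, _, last = data["name"].partition(" ")
    return {"first_name": first, "last_name": last, "age": data["age"]}


@migration("2.0.0", "3.0.0")
def add_email(data: Data) -> Data:
    return {**data, "email": f"{data['first_name'].lower()}@example.com"}


def migrate_chain(data: Data, versions: list[str]) -> Data:
    """Apply each registered step along an ordered list of versions."""
    for src, dst in zip(versions, versions[1:]):
        data = MIGRATIONS[(src, dst)](data)
    return data


result = migrate_chain({"name": "John Doe", "age": 30}, ["1.0.0", "2.0.0", "3.0.0"])
```

Each step only needs to know about its two adjacent versions, which is what lets a caller go from 1.0.0 to 3.0.0 in one call.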

Help

See documentation for complete guides and API reference.

Installation

pip install pyrmute

Quick Start

from pydantic import BaseModel
from pyrmute import ModelManager, ModelData

manager = ModelManager()


# Version 1: Simple user model
@manager.model("User", "1.0.0")
class UserV1(BaseModel):
    name: str
    age: int


# Version 2: Split name into components
@manager.model("User", "2.0.0")
class UserV2(BaseModel):
    first_name: str
    last_name: str
    age: int


# Version 3: Add email and make age optional
@manager.model("User", "3.0.0")
class UserV3(BaseModel):
    first_name: str
    last_name: str
    email: str
    age: int | None = None


# Define how to migrate between versions
@manager.migration("User", "1.0.0", "2.0.0")
def split_name(data: ModelData) -> ModelData:
    parts = data["name"].split(" ", 1)
    return {
        "first_name": parts[0],
        "last_name": parts[1] if len(parts) > 1 else "",
        "age": data["age"],
    }


@manager.migration("User", "2.0.0", "3.0.0")
def add_email(data: ModelData) -> ModelData:
    return {
        **data,
        "email": f"{data['first_name'].lower()}@example.com"
    }


# Migrate legacy data to the latest version
legacy_data = {"name": "John Doe", "age": 30}  # or legacy.model_dump()
current_user = manager.migrate(legacy_data, "User", "1.0.0", "3.0.0")

# repr() includes the class name; plain print() of a Pydantic model omits it
print(repr(current_user))
# UserV3(first_name='John', last_name='Doe', email='john@example.com', age=30)

Advanced Usage

Compare Model Versions

# See exactly what changed between versions
diff = manager.diff("User", "1.0.0", "3.0.0")
print(f"Added: {diff.added_fields}")
print(f"Removed: {diff.removed_fields}")
# Render a changelog to Markdown
print(diff.to_markdown(header_depth=4))

With header_depth=4 the output can be embedded nicely into this document.

User: 1.0.0 → 3.0.0

Added Fields
  • email: str (required)
  • first_name: str (required)
  • last_name: str (required)
Removed Fields
  • name
Modified Fields
  • age - type: int → int | None - now optional - default added: None
Breaking Changes
  • ⚠️ New required field 'last_name' will fail for existing data without defaults
  • ⚠️ New required field 'first_name' will fail for existing data without defaults
  • ⚠️ New required field 'email' will fail for existing data without defaults
  • ⚠️ Removed fields 'name' will be lost during migration
  • ⚠️ Field 'age' type changed - may cause validation errors
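
As a rough illustration of where the added, removed, and modified lists come from, the sketch below compares two field mappings with set operations. It is a simplified stand-in, not pyrmute's diff logic: `diff_fields` is a hypothetical helper, and annotations are represented as plain strings (with Pydantic models, the field names could instead be read from the `model_fields` attribute).

```python
from typing import Any


def diff_fields(old: dict[str, Any], new: dict[str, Any]) -> dict[str, list[str]]:
    """Compare two {field_name: annotation} mappings."""
    shared = old.keys() & new.keys()
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(name for name in shared if old[name] != new[name]),
    }


# Field mappings mirroring UserV1 and UserV3 from the Quick Start.
user_v1 = {"name": "str", "age": "int"}
user_v3 = {
    "first_name": "str",
    "last_name": "str",
    "email": "str",
    "age": "int | None",
}

changes = diff_fields(user_v1, user_v3)
```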

Batch Processing

# Migrate thousands of records efficiently
legacy_users = [
    {"name": "Alice Smith", "age": 28},
    {"name": "Bob Johnson", "age": 35},
    # ... thousands more
]

# Parallel processing for CPU-intensive migrations
users = manager.migrate_batch(
    legacy_users,
    "User",
    from_version="1.0.0",
    to_version="3.0.0",
    parallel=True,
    max_workers=4,
)
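
For intuition about what a parallel batch involves, here is a small standard-library sketch. It is an assumption-laden illustration, not how `migrate_batch` is implemented: `migrate_batch_parallel` is a hypothetical helper, and it uses a thread pool for simplicity, whereas CPU-bound work in CPython usually calls for a process pool.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

Data = dict[str, Any]


def split_name(data: Data) -> Data:
    first, _, last = data["name"].partition(" ")
    return {"first_name": first, "last_name": last, "age": data["age"]}


def migrate_batch_parallel(
    records: list[Data],
    migrate_one: Callable[[Data], Data],
    max_workers: int = 4,
) -> list[Data]:
    """Run migrate_one over records using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in input order,
        # regardless of which worker finishes first.
        return list(pool.map(migrate_one, records))


migrated = migrate_batch_parallel(
    [{"name": "Alice Smith", "age": 28}, {"name": "Bob Johnson", "age": 35}],
    split_name,
)
```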

Streaming Large Datasets

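from collections.abc import Iterator
from typing import Any


# Process huge datasets without loading everything into memory
def load_users_from_database() -> Iterator[dict[str, Any]]:
    # `database` stands in for your own data access layer
    yield from database.stream_users()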


# Migrate and save incrementally
for user in manager.migrate_batch_streaming(
    load_users_from_database(),
    "User",
    from_version="1.0.0",
    to_version="3.0.0",
    chunk_size=1000
):
    database.save(user)
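
The memory benefit of streaming comes from pulling a bounded number of records at a time. The sketch below shows one way to do that with `itertools.islice`; it is an illustration under stated assumptions, not pyrmute's implementation, and `migrate_streaming` and `rename_age` are hypothetical names.

```python
from collections.abc import Iterable, Iterator
from itertools import islice
from typing import Any, Callable

Data = dict[str, Any]


def migrate_streaming(
    records: Iterable[Data],
    migrate_one: Callable[[Data], Data],
    chunk_size: int = 1000,
) -> Iterator[Data]:
    """Yield migrated records while holding at most chunk_size inputs in memory."""
    iterator = iter(records)
    # islice pulls up to chunk_size items; an empty list ends the loop.
    while chunk := list(islice(iterator, chunk_size)):
        for record in chunk:
            yield migrate_one(record)


def rename_age(data: Data) -> Data:
    # Hypothetical migration used only for this example.
    return {"name": data["name"], "age_years": data["age"]}


source = ({"name": f"user{i}", "age": i} for i in range(5))
streamed = list(migrate_streaming(source, rename_age, chunk_size=2))
```

Because the function is a generator, nothing is read from the source until the caller starts iterating.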

Test Your Migrations

# Validate migration logic with test cases
results = manager.test_migration(
    "User",
    from_version="1.0.0",
    to_version="2.0.0",
    test_cases=[
        # (input, expected_output)
        (
            {"name": "Alice Smith", "age": 28},
            {"first_name": "Alice", "last_name": "Smith", "age": 28}
        ),
        (
            {"name": "Bob", "age": 35},
            {"first_name": "Bob", "last_name": "", "age": 35}
        ),
    ]
)

# Use in your test suite
assert results.all_passed, f"Migration failed: {results.failures}"

Export JSON Schemas

# Generate schemas for all versions
manager.dump_schemas("schemas/")
# Creates: User_v1.0.0.json, User_v2.0.0.json, User_v3.0.0.json

# Write nested models to separate files and reference them with $ref
manager.dump_schemas(
    "schemas/",
    separate_definitions=True,
    ref_template="https://api.example.com/schemas/{model}_v{version}.json"
)

Auto-Migration

# Skip writing migration functions for simple changes
@manager.model("Config", "1.0.0")
class ConfigV1(BaseModel):
    timeout: int = 30


@manager.model("Config", "2.0.0", backward_compatible=True)
class ConfigV2(BaseModel):
    timeout: int = 30
    retries: int = 3  # New field with default


# No migration function needed - defaults are applied automatically
config = manager.migrate({"timeout": 60}, "Config", "1.0.0", "2.0.0")
# ConfigV2(timeout=60, retries=3)
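
The effect of a default-only change can be pictured as a dictionary merge in which explicit values win over defaults. This is a simplified sketch of that idea rather than pyrmute's mechanism (Pydantic itself fills in defaults for missing fields during validation); `apply_defaults` and `config_v2_defaults` are hypothetical names.

```python
from typing import Any


def apply_defaults(data: dict[str, Any], defaults: dict[str, Any]) -> dict[str, Any]:
    """Fill in fields missing from data; values already present are kept."""
    return {**defaults, **data}


# Defaults mirroring ConfigV2 above.
config_v2_defaults = {"timeout": 30, "retries": 3}

upgraded = apply_defaults({"timeout": 60}, config_v2_defaults)
```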

Real-World Example

from datetime import datetime
from pydantic import BaseModel, EmailStr
from pyrmute import ModelManager, ModelData

manager = ModelManager()


# API v1: Basic order
@manager.model("Order", "1.0.0")
class OrderV1(BaseModel):
    id: str
    items: list[str]
    total: float


# API v2: Add customer info
@manager.model("Order", "2.0.0")
class OrderV2(BaseModel):
    id: str
    items: list[str]
    total: float
    customer_email: EmailStr


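# API v3: Structured items and timestamps
# Assumed fix: the nested item model is registered under its own name,
# since "Order" 3.0.0 is also used by OrderV3 below.
@manager.model("OrderItem", "1.0.0")
class OrderItemV3(BaseModel):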
    product_id: str
    quantity: int
    price: float


@manager.model("Order", "3.0.0")
class OrderV3(BaseModel):
    id: str
    items: list[OrderItemV3]
    total: float
    customer_email: EmailStr
    created_at: datetime


# Define migrations
@manager.migration("Order", "1.0.0", "2.0.0")
def add_customer_email(data: ModelData) -> ModelData:
    return {**data, "customer_email": "customer@example.com"}


@manager.migration("Order", "2.0.0", "3.0.0")
def structure_items(data: ModelData) -> ModelData:
    # Convert simple strings to structured items
    structured_items = [
        {
            "product_id": item,
            "quantity": 1,
            "price": data["total"] / len(data["items"])
        }
        for item in data["items"]
    ]
    return {
        **data,
        "items": structured_items,
        "created_at": datetime.now().isoformat()
    }

# Migrate old orders from your database
old_order = {"id": "123", "items": ["widget", "gadget"], "total": 29.99}
new_order = manager.migrate(old_order, "Order", "1.0.0", "3.0.0")
database.save(new_order)

Contributing

For guidance on setting up a development environment and how to make a contribution to pyrmute, see Contributing to pyrmute.

Reporting a Security Vulnerability

See our security policy.
