Skip to main content

XML Utility tools with Pydantic and Typer

Project description

xmlu

XML Utility - Transform XML files into type-safe Pydantic models with automatic type inference and structure detection.

Python 3.12+ License: MIT

Features

  • Automatic Type Inference: Detects str, int, and bool types from XML values
  • Smart Field Detection: Identifies required vs. optional fields based on occurrence patterns
  • Nested Model Support: Recognizes and generates nested models for XML elements with Name/Value attributes
  • Pythonic Naming: Converts XML tags to snake_case with field aliases to preserve original names
  • CLI & API: Use as a command-line tool or import as a Python library
  • Type-Safe: Generated models use Pydantic v2 for runtime validation
  • Rich Output: Beautiful terminal interface with progress information

Installation

# Using uv (recommended)
uv pip install xmlu

# Using pip
pip install xmlu

Quick Start

CLI Usage

# Generate models from XML file
xmlu generate schedule.xml --parent Event

# Specify custom output file
xmlu generate data.xml --parent Item --output models.py

# Verbose mode for detailed progress
xmlu generate schedule.xml --parent Event --verbose

# Show version
xmlu version

Python API

from xmlu import generate_pydantic_models, create_models_file

# Generate models
models = generate_pydantic_models("schedule.xml", parent_element="Event")
Event, Fields = models

# Create a models file
output_file = create_models_file("schedule.xml", list(models))
print(f"Generated: {output_file}")

# Use the generated models
event = Event(
    is_fixed=True,
    fields=Fields(duration="01:00:00", enabled=True)
)

How It Works

xmlu analyzes your XML structure and generates Pydantic models with intelligent defaults:

Input XML:

<Schedule>
    <Event>
        <EventId>1</EventId>
        <IsFixed>True</IsFixed>
        <Fields>
            <Parameter Name="Duration" Value="01:00:00" />
            <Parameter Name="Device" Value="SM-1" />
        </Fields>
    </Event>
    <Event>
        <EventId>2</EventId>
        <IsFixed>False</IsFixed>
        <Fields>
            <Parameter Name="Duration" Value="02:00:00" />
        </Fields>
    </Event>
</Schedule>

Generated Pydantic Models:

"""Auto-generated Pydantic models from Schedule.xml"""
from pydantic import BaseModel, Field, ConfigDict
from typing import Optional

class Fields(BaseModel):
    model_config = ConfigDict(
        str_strip_whitespace=True,
        validate_assignment=True,
        populate_by_name=True,
    )

    duration: str = Field(alias="Duration")
    device: Optional[str] = Field(None, alias="Device")  # Optional - not in all Events

class Event(BaseModel):
    model_config = ConfigDict(
        str_strip_whitespace=True,
        validate_assignment=True,
        populate_by_name=True,
    )

    event_id: int = Field(alias="EventId")
    is_fixed: bool = Field(alias="IsFixed")
    fields: Fields = Field(alias="Fields")

Features in Detail

Type Inference

xmlu automatically infers Python types:

  • Boolean: True, False, 0, 1bool
  • Integer: Numeric strings → int
  • String: All other values → str

Optional Fields

Fields that don't appear in every XML element are marked as Optional[T]:

# Field present in 2 out of 3 events → Optional
device: Optional[str] = None

Nested Structures

XML elements with child elements containing Name and Value attributes are automatically converted to nested models:

<Fields>
    <Parameter Name="Duration" Value="01:00:00" />
    <Parameter Name="Device" Value="SM-1" />
</Fields>

Becomes:

class Fields(BaseModel):
    duration: str
    device: str

Snake Case Conversion

XML tags are converted to Pythonic snake_case while preserving original names via field aliases:

# XML: <EventId>123</EventId>
event_id: int = Field(alias="EventId")

# Both work for parsing:
Event(event_id=123)
Event(EventId=123)

CLI Commands

generate

Generate Pydantic models from an XML file.

xmlu generate [FILE] [OPTIONS]

Options:

  • --parent, -p TEXT: XML tag to use as parent model (default: "Event")
  • --output, -o PATH: Output file path (default: auto-generated)
  • --verbose, -v: Show detailed progress information
  • --help: Show help message

Examples:

# Basic usage
xmlu generate schedule.xml

# Custom parent element
xmlu generate data.xml --parent CustomElement

# Specify output file
xmlu generate data.xml -o app/models.py

# Verbose output
xmlu generate schedule.xml --verbose

version

Show version information.

xmlu version

API Reference

generate_pydantic_models(file_path, parent_element="Event")

Generate Pydantic models from an XML file.

Parameters:

  • file_path (str): Path to the XML file
  • parent_element (str): XML tag name to use as parent model

Returns:

  • tuple[type[BaseModel], ...]: Tuple of Pydantic models (parent first, then nested models)

Example:

models = generate_pydantic_models("schedule.xml", "Event")
Event, Fields = models

create_models_file(file_path, models)

Write Pydantic models to a Python file.

Parameters:

  • file_path (str): Path to source XML file (used for naming output)
  • models (list[type[BaseModel]]): List of Pydantic models to write

Returns:

  • str: Name of the generated file

Example:

output_file = create_models_file("schedule.xml", list(models))
# Returns: "schedule_models.py"

Utility Functions

convert_to_snake_case(name)

Convert any string to snake_case.

from xmlu import convert_to_snake_case

convert_to_snake_case("CamelCase")      # "camel_case"
convert_to_snake_case("HTTPServer")     # "http_server"
convert_to_snake_case("kebab-case")     # "kebab_case"

convert_to_pascal_case(name)

Convert any string to PascalCase.

from xmlu import convert_to_pascal_case

convert_to_pascal_case("snake_case")    # "SnakeCase"
convert_to_pascal_case("HTTPServer")    # "HTTPServer"
convert_to_pascal_case("kebab-case")    # "KebabCase"

Development

Setup

# Clone the repository
git clone https://github.com/MichielMe/xmlu.git
cd xmlu

# Install dependencies with uv
uv sync

# Run tests
make test

Running Tests

# Run all tests with pytest
make test

# Or directly with uv
uv run pytest -v

Project Structure

xmlu/
├── src/
│   └── xmlu/
│       ├── __init__.py              # Public API exports
│       ├── main.py                  # CLI application
│       ├── convert_to_pydantic.py   # Core model generation
│       └── utils.py                 # String utilities & type inference
├── tests/
│   ├── test_generator.py            # Model generation tests
│   └── test_string_utils.py         # Utility function tests
├── pyproject.toml                   # Project configuration
└── README.md

Configuration

Generated models include sensible defaults:

model_config = ConfigDict(
    str_strip_whitespace=True,      # Auto-strip whitespace
    validate_assignment=True,        # Validate on field assignment
    populate_by_name=True,           # Accept both snake_case and original names
)

Requirements

  • Python 3.12+
  • lxml >= 6.0.2
  • pydantic >= 2.12.3
  • typer >= 0.20.0

License

MIT License - see LICENSE file for details.

Author

Michiel Meire

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Roadmap

  • JSON output format support
  • XML Schema (XSD) validation
  • Support for XML attributes (not just elements)
  • List/array type detection for repeated elements
  • Custom type mapping configuration
  • Watch mode for automatic regeneration

Acknowledgments

Built with:

  • Pydantic - Data validation using Python type hints
  • lxml - XML processing library
  • Typer - CLI framework
  • Rich - Terminal formatting (via Typer)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

xmlu-0.1.1.tar.gz (41.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

xmlu-0.1.1-py3-none-any.whl (13.8 kB view details)

Uploaded Python 3

File details

Details for the file xmlu-0.1.1.tar.gz.

File metadata

  • Download URL: xmlu-0.1.1.tar.gz
  • Upload date:
  • Size: 41.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.4

File hashes

Hashes for xmlu-0.1.1.tar.gz
Algorithm Hash digest
SHA256 941c037042ebbed2ccb20f14ed2deb3e5eb0069ae59acadbc1a556f83af0e970
MD5 89fb785bcd8483a3fc6e7f4eb039641e
BLAKE2b-256 0c7f3eff083df63c9d1cc2b820b364a2aad0fbc9e7541b913022bd8e5d035e95

See more details on using hashes here.

File details

Details for the file xmlu-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: xmlu-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 13.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.4

File hashes

Hashes for xmlu-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 6de1b50a97d8e71f1aa31c7afe7553fc9ddb227a72f3a4dbd96e6277fa4fd06e
MD5 2dfc6e79e2828bd30e7c85167087e79c
BLAKE2b-256 d77212c90ea88d029c49b82989fb8b1648f64026ce502ceaef1fd3ec53a3f8c4

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page