A datatrees wrapper for creating XML serializers / deserializers. Via datatrees it extends Python's dataclasses to support XML to Python class mappings.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

owebeeone

These details have not been verified by PyPI

Project description

XDataTrees

A datatrees wrapper for creating XML serializers and deserializers with automatic field mapping, type conversion, and namespace support.

XDataTrees extends Python's dataclasses through datatrees to provide a clean, declarative API for mapping between XML documents and Python objects. It eliminates the boilerplate code typically required for XML serialization/deserialization while providing type safety and comprehensive error handling.

Installation

pip install xdatatrees

Core Features

Declarative XML Mapping: Map Python classes to XML using simple annotations
Automatic Type Conversion: Built-in conversion between Python types and XML string representations
Flexible Field Types: Support for XML attributes, elements, and metadata
Namespace Support: Full XML namespace handling with automatic prefix management
Custom Converters: Extensible system for complex data type transformations
Value Collectors: Specialized handling for collections and custom aggregations
Comprehensive Error Handling: Detailed error reporting with unused element/attribute detection
Name Transformations: Automatic conversion between naming conventions (camelCase ↔ snake_case)

Why XDataTrees?

The Challenge of XML Processing

Traditional XML processing in Python often requires verbose, error-prone code:

# Without XDataTrees - manual XML processing
import xml.etree.ElementTree as ET

def parse_person_xml(xml_string):
    root = ET.fromstring(xml_string)
    person = {
        'name': root.get('name'),
        'age': int(root.get('age')),
        'address': root.find('address').text if root.find('address') is not None else None
    }
    return person

def create_person_xml(person):
    root = ET.Element('person')
    root.set('name', person['name'])
    root.set('age', str(person['age']))
    if person.get('address'):
        addr_elem = ET.SubElement(root, 'address')
        addr_elem.text = person['address']
    return ET.tostring(root, encoding='unicode')

This approach has several problems:

Manual type conversion and validation
No clear mapping between XML structure and Python objects
Error-prone attribute and element handling
No namespace support
Difficult to maintain as XML schema evolves

XDataTrees: Declarative XML Processing

XDataTrees provides a clean, maintainable pattern:

# With XDataTrees - declarative and type-safe
from xdatatrees import xdatatree, xfield, Attribute, TextElement

@xdatatree
class Person:
    name: str = xfield(ftype=Attribute, doc='Person name')
    age: int = xfield(ftype=Attribute, doc='Person age')
    address: str = xfield(ftype=TextElement, doc='Person address')

# Automatic serialization/deserialization with full type safety
# Usage shown in examples below

Benefits

Type Safety: Automatic conversion with validation
Clear Schema Definition: The class structure documents the XML schema
Maintainable: Changes to XML structure require minimal code changes
Namespace Aware: Built-in support for complex XML namespaces
Error Handling: Comprehensive validation and error reporting
Extensible: Custom converters for complex data types

Basic Usage

1. Simple XML Mapping

from xdatatrees import xdatatree, xfield, Attribute, TextElement
from typing import List

# Define field defaults for consistency
DEFAULT_CONFIG = xfield(ftype=Attribute)

@xdatatree
class Person:
    XDATATREE_CONFIG = DEFAULT_CONFIG  # Sets default for all fields
    name: str = xfield(doc='Person name')
    age: int = xfield(doc='Person age') 
    address: str = xfield(ftype=TextElement, doc='Person address')  # Override default

# This maps to XML like:
# <person name="John" age="30">
#   <address>123 Main St</address>
# </person>

2. Working with Lists and Collections

from xdatatrees import Element

@xdatatree
class People:
    XDATATREE_CONFIG = xfield(ftype=Element)  # Default to elements
    people: List[Person] = xfield(doc='List of people')
    count: int = xfield(ftype=Attribute, doc='Total count')

# Maps to:
# <people count="2">
#   <people name="John" age="30"><address>123 Main St</address></people>
#   <people name="Jane" age="25"><address>456 Oak Ave</address></people>
# </people>

3. Serialization and Deserialization

from xdatatrees import XmlSerializationSpec
import xml.etree.ElementTree as ET

# Create serialization specification
SERIALIZATION_SPEC = XmlSerializationSpec(
    People,        # Root class
    'people',      # Root element name
    xml_namespaces=None  # Optional XML namespaces
)

# Deserialize from XML
xml_string = '''
<people count="1">
    <people name="Alice" age="28">
        <address>789 Pine Rd</address>
    </people>
</people>
'''

xml_tree = ET.fromstring(xml_string)
people_obj, status = SERIALIZATION_SPEC.deserialize(xml_tree)

print(f"Count: {people_obj.count}")  # 1
print(f"First person: {people_obj.people[0].name}")  # Alice

# Serialize back to XML
xml_element = SERIALIZATION_SPEC.serialize(people_obj)
xml_string = ET.tostring(xml_element, encoding='unicode')

Field Types

XDataTrees supports four main field mapping types:

Attribute Fields

Map to XML attributes on the element:

from xdatatrees import xdatatree, xfield, Attribute

@xdatatree
class Product:
    id: str = xfield(ftype=Attribute)
    price: float = xfield(ftype=Attribute)

# <product id="ABC123" price="29.99"/>

Element Fields

Map to child XML elements containing nested xdatatree classes:

from xdatatrees import Element

@xdatatree
class Address:
    street: str = xfield(ftype=Attribute)
    city: str = xfield(ftype=Attribute)

@xdatatree
class Person:
    name: str = xfield(ftype=Attribute)
    address: Address = xfield(ftype=Element)

# <person name="John">
#   <address street="123 Main St" city="Anytown"/>
# </person>

TextElement Fields

Map to XML elements containing only simple text content (str, int, float, bool):

from xdatatrees import xdatatree, xfield, TextElement

@xdatatree
class Product:
    name: str = xfield(ftype=TextElement)
    price: float = xfield(ftype=TextElement)
    in_stock: bool = xfield(ftype=TextElement)

# <product>
#   <name>Widget</name>
#   <price>29.99</price>
#   <in_stock>true</in_stock>
# </product>

Metadata Fields

Map to special metadata elements with key-value structure:

from xdatatrees import Metadata

@xdatatree
class Product:
    category: str = xfield(ftype=Metadata)

# <product>
#   <metadata key="category" value="electronics"/>
# </product>

Advanced Features

1. Custom Converters for Complex Types

For data types that need special handling, create custom converter classes:

from xdatatrees import xdatatree, xfield, Attribute, Metadata
from datatrees import datatree, dtfield
import numpy as np
import re
from typing import Union

@datatree
class MatrixConverter:
    """Convert between space-separated string and 4x4 matrix."""
    matrix: np.ndarray = dtfield(doc='4x4 transformation matrix')

    def __init__(self, matrix_input: Union[str, np.ndarray]):
        if isinstance(matrix_input, np.ndarray):
            self.matrix = matrix_input
        else:
            # Parse "1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1" format
            values = [float(x) for x in re.split(r'\s+', matrix_input.strip())]
            if len(values) != 16:
                raise ValueError(f"Expected 16 values, got {len(values)}")
            self.matrix = np.array(values).reshape((4, 4))

    def __str__(self):
        return ' '.join(str(x) for x in self.matrix.flatten())

@xdatatree
class Transform:
    XDATATREE_CONFIG = xfield(ftype=Metadata)
    id: str = xfield(ftype=Attribute, doc='Transform ID')
    matrix: MatrixConverter = xfield(doc='Transformation matrix')

# Usage:
# <transform id="T1">
#   <metadata key="matrix" value="1 0 0 5 0 1 0 10 0 0 1 0 0 0 0 1"/>
# </transform>

2. XML Namespaces

Handle complex XML documents with multiple namespaces:

from xdatatrees import xdatatree, xfield, Attribute, XmlNamespaces, XmlSerializationSpec

# Define namespaces
NAMESPACES = XmlNamespaces(
    xmlns="http://schemas.example.com/model/2023",
    geom="http://schemas.example.com/geometry/2023",
    mat="http://schemas.example.com/materials/2023"
)

@xdatatree
class Component:
    XDATATREE_CONFIG = xfield(xmlns=NAMESPACES.xmlns, ftype=Attribute)
    
    id: str = xfield(doc='Component ID')
    geometry_ref: str = xfield(xmlns=NAMESPACES.geom, doc='Geometry reference')
    material_id: str = xfield(xmlns=NAMESPACES.mat, doc='Material ID')

# Create specification with namespaces
SERIALIZATION_SPEC = XmlSerializationSpec(
    Component,
    'component',
    xml_namespaces=NAMESPACES
)

# Handles XML like:
# <component xmlns="http://schemas.example.com/model/2023"
#           xmlns:geom="http://schemas.example.com/geometry/2023" 
#           xmlns:mat="http://schemas.example.com/materials/2023"
#           id="C001" 
#           geom:geometry_ref="G001"
#           mat:material_id="M001"/>

3. Name Transformations

Automatically convert between naming conventions:

from xdatatrees import xdatatree, xfield, Attribute, TextElement, CamelSnakeConverter, SnakeCamelConverter

@xdatatree
class APIResponse:
    XDATATREE_CONFIG = xfield(
        ename_transform=CamelSnakeConverter,    # element names: snake_case → camelCase
        aname_transform=SnakeCamelConverter,    # attribute names: snake_case → camelCase
        ftype=Attribute
    )
    
    user_id: str = xfield(doc='User identifier')        # becomes UserId attribute
    creation_date: str = xfield(ftype=TextElement, doc='Creation date')  # becomes creation_date element

# Python field: user_id → XML attribute: UserId
# Python field: creation_date → XML element: creation_date

4. Custom Value Collectors

For specialized collection handling and data aggregation:

from xdatatrees import ValueCollector, Element
from typing import List, Tuple

@xdatatree
class Triangle:
    v1: int = xfield(ftype=Attribute, doc='First vertex index')
    v2: int = xfield(ftype=Attribute, doc='Second vertex index') 
    v3: int = xfield(ftype=Attribute, doc='Third vertex index')
    color: str = xfield(ftype=Attribute, doc='Triangle color')

@datatree
class TriangleCollector(ValueCollector):
    """Collect triangles into optimized numpy arrays."""
    vertices: List[np.ndarray] = dtfield(default_factory=list, doc='Vertex indices')
    colors: List[str] = dtfield(default_factory=list, doc='Triangle colors')
    
    CONTAINED_TYPE = Triangle
    
    def append(self, triangle: Triangle):
        if not isinstance(triangle, Triangle):
            raise ValueError(f'Expected Triangle, got {type(triangle).__name__}')
        self.vertices.append(np.array([triangle.v1, triangle.v2, triangle.v3]))
        self.colors.append(triangle.color)

    def get(self) -> Tuple[np.ndarray, List[str]]:
        """Return optimized representation."""
        return np.array(self.vertices), self.colors
    
    @classmethod
    def to_contained_type(cls, data: Tuple[np.ndarray, List[str]]):
        """Convert back to Triangle objects for serialization."""
        vertices, colors = data
        for vertex_trio, color in zip(vertices, colors):
            yield Triangle(v1=vertex_trio[0], v2=vertex_trio[1], v3=vertex_trio[2], color=color)

@xdatatree  
class Mesh:
    triangles: TriangleCollector = xfield(ftype=Element, doc='Mesh triangles')

# The collector automatically handles:
# - Parsing multiple <triangle> elements into optimized arrays
# - Converting back to individual Triangle objects for serialization
# - Providing efficient access to the aggregated data

Error Handling and Validation

XDataTrees provides comprehensive error handling and validation:

1. Parser Options

Control parsing behavior and error reporting:

from xdatatrees import XmlParserOptions, XmlSerializationSpec

# Strict parsing - fail on unknown elements/attributes
strict_options = XmlParserOptions(
    assert_unused_elements=True,     # Raise error for unknown elements
    assert_unused_attributes=True,   # Raise error for unknown attributes
    print_unused_elements=True       # Print warnings for debugging
)

SERIALIZATION_SPEC = XmlSerializationSpec(
    MyClass,
    'root',
    options=strict_options
)

2. Handling Unknown XML Content

# Deserialize with error checking
model, status = SERIALIZATION_SPEC.deserialize(xml_tree)

# Check for parsing issues
if status.contains_unknown_elements:
    print("Warning: Found unknown XML elements")
    # Access details through the model objects
    
if status.contains_unknown_attributes:
    print("Warning: Found unknown XML attributes")

# Examine specific nodes for unknown content
def check_unknown_content(node):
    if hasattr(node, 'contains_unknown_elements') and node.contains_unknown_elements:
        print(f'Node {node.__class__.__name__} has unknown elements:')
        print(node.xdatatree_unused_xml_elements)
        
    if hasattr(node, 'contains_unknown_attributes') and node.contains_unknown_attributes:
        print(f'Node {node.__class__.__name__} has unknown attributes:')
        print(node.xdatatree_unused_xml_attributes)

3. Common Error Types

XDataTrees provides specific error types for different failure modes:

from xdatatrees import TooManyValuesError

try:
    model, status = SERIALIZATION_SPEC.deserialize(xml_tree)
except TooManyValuesError as e:
    print(f"Multiple values provided for single-value field: {e}")
except Exception as e:
    print(f"General parsing error: {e}")

Best Practices

Define XDATATREE_CONFIG: Set common defaults to reduce repetition and improve consistency
```
DEFAULT_CONFIG = xfield(ftype=Attribute, ename_transform=CamelSnakeConverter)
```
Use Descriptive Documentation: Document all fields for better maintainability
```
name: str = xfield(doc='User full name (first and last)')
```
Leverage Custom Converters: Create reusable converters for complex data types
Use Namespaces: Always define and use XmlNamespaces for production XML schemas
Handle Errors Gracefully: Use appropriate parser options and error handling for your use case
Type Annotations Matter: XDataTrees uses type annotations for conversion - ensure they're accurate
Validate Early: Use strict parsing during development to catch schema mismatches

Complete Example: 3D Model Serialization

Here's a comprehensive example showing XDataTrees in action:

Click to see the complete 3D model example

from xdatatrees import xdatatree, xfield, Attribute, Element, Metadata
from xdatatrees import XmlSerializationSpec, XmlNamespaces, XmlParserOptions
from datatrees import datatree, dtfield
from typing import List
import xml.etree.ElementTree as ET

# Define namespaces
NAMESPACES = XmlNamespaces(
    xmlns="http://schemas.example.com/3d/2023",
    material="http://schemas.example.com/materials/2023"
)

# Field defaults
DEFAULT_CONFIG = xfield(xmlns=NAMESPACES.xmlns, ftype=Attribute)

@xdatatree
class Vertex:
    XDATATREE_CONFIG = DEFAULT_CONFIG
    x: float = xfield(doc='X coordinate')
    y: float = xfield(doc='Y coordinate') 
    z: float = xfield(doc='Z coordinate')

@xdatatree
class Triangle:
    XDATATREE_CONFIG = DEFAULT_CONFIG
    v1: int = xfield(doc='First vertex index')
    v2: int = xfield(doc='Second vertex index')
    v3: int = xfield(doc='Third vertex index')

@xdatatree
class Material:
    XDATATREE_CONFIG = xfield(xmlns=NAMESPACES.material, ftype=Attribute)
    id: str = xfield(doc='Material identifier')
    name: str = xfield(doc='Material name')
    color: str = xfield(doc='Material color (hex)')

@xdatatree
class Mesh:
    XDATATREE_CONFIG = xfield(xmlns=NAMESPACES.xmlns, ftype=Element)
    vertices: List[Vertex] = xfield(doc='Mesh vertices')
    triangles: List[Triangle] = xfield(doc='Mesh triangles')
    material_id: str = xfield(ftype=Attribute, doc='Associated material')

@xdatatree
class Model:
    XDATATREE_CONFIG = xfield(xmlns=NAMESPACES.xmlns, ftype=Element)
    name: str = xfield(ftype=Attribute, doc='Model name')
    materials: List[Material] = xfield(doc='Available materials')
    meshes: List[Mesh] = xfield(doc='Model meshes')

# Create serialization spec
SPEC = XmlSerializationSpec(
    Model,
    'model',
    xml_namespaces=NAMESPACES,
    options=XmlParserOptions(assert_unused_elements=True)
)

# Create a model
model = Model(
    name="Simple Cube",
    materials=[
        Material(id="mat1", name="Red Plastic", color="#FF0000")
    ],
    meshes=[
        Mesh(
            material_id="mat1",
            vertices=[
                Vertex(x=0.0, y=0.0, z=0.0),
                Vertex(x=1.0, y=0.0, z=0.0),
                Vertex(x=1.0, y=1.0, z=0.0),
                Vertex(x=0.0, y=1.0, z=0.0),
            ],
            triangles=[
                Triangle(v1=0, v2=1, v3=2),
                Triangle(v1=0, v2=2, v3=3),
            ]
        )
    ]
)

# Serialize to XML
xml_element = SPEC.serialize(model)
xml_string = ET.tostring(xml_element, encoding='unicode')
print("Generated XML:")
print(xml_string)

# Deserialize back
xml_tree = ET.fromstring(xml_string)
restored_model, status = SPEC.deserialize(xml_tree)

print(f"\nRestored model: {restored_model.name}")
print(f"Material count: {len(restored_model.materials)}")
print(f"Vertex count: {len(restored_model.meshes[0].vertices)}")

Limitations

InitVar Support: Dataclasses InitVar annotations are not currently supported
Circular References: Circular object references are not handled automatically
Schema Validation: XDataTrees focuses on serialization/deserialization, not XML schema validation

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the LGPLv2.1 - see the LICENSE file for details.

Acknowledgments

XDataTrees was developed to meet anchorscad's need for robust 3MF file serialization and deserialization. It builds upon the datatrees library (also from the anchorscad project) to provide a comprehensive solution for XML processing with the full feature set of datatrees available through the xdatatree decorator.

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

owebeeone

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.1.3

Aug 3, 2025

0.1.2

Jul 31, 2025

This version

0.1.1

Jul 28, 2025

0.1.0

Jan 5, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

xdatatrees-0.1.1.tar.gz (39.6 kB view details)

Uploaded Jul 28, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

xdatatrees-0.1.1-py3-none-any.whl (27.3 kB view details)

Uploaded Jul 28, 2025 Python 3

File details

Details for the file xdatatrees-0.1.1.tar.gz.

File metadata

Download URL: xdatatrees-0.1.1.tar.gz
Upload date: Jul 28, 2025
Size: 39.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for xdatatrees-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`ad870ecb927226b1949ac88e6ba543a7a08c1153d4be74569d6cbf6f48446c40`
MD5	`d9cbb0d2b43810c0dfaebf287d394601`
BLAKE2b-256	`0796d65bf937d42f05f87dc8ca0cf81bc07c97ed81d407889e09a20e59c26cc4`

See more details on using hashes here.

Provenance

The following attestation bundles were made for xdatatrees-0.1.1.tar.gz:

Publisher: publish.yml on owebeeone/xdatatrees

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: xdatatrees-0.1.1.tar.gz
- Subject digest: ad870ecb927226b1949ac88e6ba543a7a08c1153d4be74569d6cbf6f48446c40
- Sigstore transparency entry: 320042977
- Sigstore integration time: Jul 28, 2025
Source repository:
- Permalink: owebeeone/xdatatrees@ec519653fd77b5e8b7570a834a471b6c1486ded6
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/owebeeone
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@ec519653fd77b5e8b7570a834a471b6c1486ded6
- Trigger Event: release

File details

Details for the file xdatatrees-0.1.1-py3-none-any.whl.

File metadata

Download URL: xdatatrees-0.1.1-py3-none-any.whl
Upload date: Jul 28, 2025
Size: 27.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for xdatatrees-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c6804d969f71004657d01f904a1775888fa00e9114f35a488a1b76f05bb447ca`
MD5	`8e8156c18afc8ed93d937ed794b6a15f`
BLAKE2b-256	`a9a18b78b61015446acae919e0f8fc9cb34b1b76b9c22c710ff3691bd396c66d`

See more details on using hashes here.

Provenance

The following attestation bundles were made for xdatatrees-0.1.1-py3-none-any.whl:

Publisher: publish.yml on owebeeone/xdatatrees

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: xdatatrees-0.1.1-py3-none-any.whl
- Subject digest: c6804d969f71004657d01f904a1775888fa00e9114f35a488a1b76f05bb447ca
- Sigstore transparency entry: 320042996
- Sigstore integration time: Jul 28, 2025
Source repository:
- Permalink: owebeeone/xdatatrees@ec519653fd77b5e8b7570a834a471b6c1486ded6
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/owebeeone
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@ec519653fd77b5e8b7570a834a471b6c1486ded6
- Trigger Event: release

xdatatrees 0.1.1

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

XDataTrees

Installation

Core Features

Why XDataTrees?

The Challenge of XML Processing

XDataTrees: Declarative XML Processing

Benefits

Basic Usage

1. Simple XML Mapping

2. Working with Lists and Collections

3. Serialization and Deserialization

Field Types

Attribute Fields

Element Fields

TextElement Fields

Metadata Fields

Advanced Features

1. Custom Converters for Complex Types

2. XML Namespaces

3. Name Transformations

4. Custom Value Collectors

Error Handling and Validation

1. Parser Options

2. Handling Unknown XML Content

3. Common Error Types

Best Practices

Complete Example: 3D Model Serialization

Limitations

Contributing

License

Acknowledgments

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance