Create LXML Elements as dataclasses representations

LXML-DATACLASS

Utility to mirror Python Classes with lxml.etree Elements using class annotations and field descriptors.

⚠️ Warning: this package was created to help its owner describe and manipulate large XML files.
Use at your own risk.
🆘 Help wanted: if you are interested in continuing the development of this package, send me a PM.

Installation

pip install lxml-dataclass
If you run into issues with the lxml version or its binary build, try installing with pip's --no-binary option.

Basic usage

Lxml-dataclass reuses a large part of the standard dataclasses implementation, with some modifications that add two utility methods, to_lxml_element and from_lxml_element, to every class that inherits from the Element base class. Declaring fields with the element_field function lets the metaclass keep track of the lxml.etree attributes of each field.

Let's start with some basic examples:

from lxml_dataclass import Element, element_field


class Author(Element):
    __tag__ = 'Author'
    
    name: str = element_field('Name')
    last: str = element_field('LastName', default='Doe')

    def fullname(self) -> str:
        return f"{self.name} {self.last}"

author = Author('John')
author.fullname()
# John Doe

The example above defines a class called Author with the attributes name and last. Each attribute is declared with its type annotation and the field descriptor function element_field. This function accepts almost the same arguments as the original dataclasses field function, except that its first argument must be the tag name for the attribute. The __init__ method is generated automatically, and you can override it.

In the example above, the tag for name will be Name and the tag for last will be LastName.

Now we call the to_lxml_element() method to obtain the lxml.etree.Element representing this object:

import lxml.etree as ET

author_element = author.to_lxml_element() 
ET.tostring(author_element)
# b'<Author><Name>John</Name><LastName>Doe</LastName></Author>'
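To make the idea of a field that carries its own tag name more concrete, here is a minimal sketch that uses only the standard library (dataclasses and xml.etree.ElementTree). It is not how lxml-dataclass is implemented; the tagged helper and the to_element function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field, fields
import xml.etree.ElementTree as ET


def tagged(tag, **kwargs):
    """Hypothetical helper: a dataclass field that remembers its XML tag."""
    return field(metadata={"tag": tag}, **kwargs)


@dataclass
class Author:
    name: str = tagged("Name")
    last: str = tagged("LastName", default="Doe")


def to_element(obj, root_tag):
    """Build one child element per dataclass field, using the stored tag."""
    root = ET.Element(root_tag)
    for f in fields(obj):
        child = ET.SubElement(root, f.metadata["tag"])
        child.text = str(getattr(obj, f.name))
    return root


xml_bytes = ET.tostring(to_element(Author("John"), "Author"))
print(xml_bytes)
# b'<Author><Name>John</Name><LastName>Doe</LastName></Author>'
```

The tag name lives in the field metadata, so the attribute name (last) and the XML tag (LastName) can differ, which is the same separation element_field provides.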

The Element class also includes another utility method, to_string_element(), which calls ET.tostring() on the resulting element and accepts the same keyword arguments.

author.to_string_element(pretty_print=True).decode('utf-8')
#<Author>
#  <Name>John</Name>
#  <LastName>Doe</LastName>
#</Author>
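The from_lxml_element method mentioned earlier is not demonstrated in this README, and its exact signature is not shown here. As a rough illustration of the reverse direction only, the following standard-library sketch rebuilds a dataclass instance from parsed XML; from_element is a hypothetical function, not part of lxml-dataclass.

```python
from dataclasses import dataclass, field, fields
import xml.etree.ElementTree as ET


@dataclass
class Author:
    name: str = field(metadata={"tag": "Name"})
    last: str = field(default="Doe", metadata={"tag": "LastName"})


def from_element(cls, element):
    """Hypothetical inverse: read each field from the child with its tag."""
    kwargs = {}
    for f in fields(cls):
        child = element.find(f.metadata["tag"])
        if child is not None:
            # Missing children are skipped so the field default applies.
            kwargs[f.name] = child.text
    return cls(**kwargs)


root = ET.fromstring("<Author><Name>John</Name></Author>")
author = from_element(Author, root)
print(author)
# Author(name='John', last='Doe')
```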

Attribs and Nsmap

As you know, every lxml Element accepts the attrib and nsmap arguments. For Element classes, you define them through the __attrib__ and __nsmap__ class or instance attributes; for individual fields, you pass them to element_field with the attrib and nsmap keyword arguments.

Let's add some attributes:

class Author(Element):
    __tag__ = 'Author'
    __attrib__ = {'ID': 'Test ID'}
    
    name: str = element_field('Name', nsmap={None: 'suffix'})
    last: str = element_field('LastName', default='Doe')

Before we get the element string, we will create a new instance and change its ID attribute.

author = Author('John')
author.__attrib__['ID'] = 'Changed ID'
author.to_string_element(pretty_print=True).decode('utf-8')
#<Author ID="Changed ID">
#  <Name xmlns="suffix">John</Name>
#  <LastName>Doe</LastName>
#</Author>
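For readers less familiar with how attributes and namespaces look at the element level, here is a small sketch using the standard library's xml.etree.ElementTree instead of lxml. The standard library has no nsmap argument, so the namespace is written in Clark notation ("{namespace}tag"); this is an assumption made to keep the example dependency-free, and the serialized string may use a generated prefix rather than the default xmlns shown above.

```python
import xml.etree.ElementTree as ET

# Root-level attributes, comparable to __attrib__ in the example above.
root = ET.Element("Author", attrib={"ID": "Test ID"})
root.set("ID", "Changed ID")

# Clark notation stands in for nsmap={None: 'suffix'} on the Name field.
name = ET.SubElement(root, "{suffix}Name")
name.text = "John"
last = ET.SubElement(root, "LastName")
last.text = "Doe"

# Round-trip through a string to check what a parser sees.
parsed = ET.fromstring(ET.tostring(root))
print(parsed.get("ID"))  # Changed ID
print(parsed[0].tag)     # {suffix}Name
print(parsed[1].tag)     # LastName
```

Only the Name child belongs to the suffix namespace; LastName stays unqualified, which matches the output shown above.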

Class inheritance and composition

Inheritance works exactly as it does with dataclasses, so you can inherit from other Element classes within the usual dataclass rules for __init__ generation.

Let's create an Element mixin that adds an ID to every instance of a class that inherits from it.

from uuid import UUID, uuid4

class HasIDMixin(Element):

    id: UUID = element_field('Id', default_factory=uuid4)


class Author(HasIDMixin, Element):
    __tag__ = 'Author'
    
    name: str = element_field('Name')
    last: str = element_field('LastName', default='Doe')

author = Author('John', 'Mayer')
author.to_string_element(pretty_print=True).decode('utf-8')
#<Author>
#  <Name>John</Name>
#  <LastName>Mayer</LastName>
#  <Id>a6ff6e02-eeb7-4ca5-8e4d-efedc40ee9ae</Id>
#</Author>

Now let's create a Book element that will be a child of Author:

class Book(Element):
    __tag__ = 'Book'

    name: str = element_field('Name')
    pages: int = element_field('Pages', default=50)


class Author(Element):
    __tag__ = 'Author'
    
    name: str = element_field('Name')
    last: str = element_field('LastName', default='Doe')
    book: Book | None = element_field('book', default=None) # Notice the tag is the same as the attribute name

author = Author('John')
book = Book('My cool book', 80)
author.book = book
author.to_string_element(pretty_print=True).decode('utf-8')
#<Author>
#  <Name>John</Name>
#  <LastName>Doe</LastName>
#  <Book>
#    <Name>My cool book</Name>
#    <Pages>80</Pages>
#  </Book>
#</Author>

Now, what if we want an Author to have multiple Books? Then we set the special is_iterable keyword of the element_field function, as follows:

class Author(Element):
    __tag__ = 'Author'
    
    name: str = element_field('Name')
    last: str = element_field('LastName', default='Doe')
    books: list[Book] = element_field('books', default_factory=list, is_iterable=True) # Notice the tag is the same as the attribute name

Now you can do:

author = Author('John')
book_a = Book('My cool book A', 80)
book_b = Book('My cool book B')
author.books.append(book_a)
author.books.append(book_b)
author.to_string_element(pretty_print=True).decode('utf-8')
#<Author>
#  <Name>John</Name>
#  <LastName>Doe</LastName>
#  <Book>
#    <Name>My cool book A</Name>
#    <Pages>80</Pages>
#  </Book>
#  <Book>
#    <Name>My cool book B</Name>
#    <Pages>50</Pages>
#  </Book>
#</Author>
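To illustrate what serializing nested and repeated children involves, here is a minimal standard-library sketch. It is not the lxml-dataclass implementation: to_element is a hypothetical function, and treating any list value as a repeated child is an assumption that stands in for the is_iterable flag.

```python
from dataclasses import dataclass, field, fields, is_dataclass
import xml.etree.ElementTree as ET


@dataclass
class Book:
    name: str = field(metadata={"tag": "Name"})
    pages: int = field(default=50, metadata={"tag": "Pages"})


@dataclass
class Author:
    name: str = field(metadata={"tag": "Name"})
    books: list[Book] = field(default_factory=list, metadata={"tag": "Book"})


def to_element(obj, tag):
    """Recursively convert a dataclass instance into an XML element."""
    root = ET.Element(tag)
    for f in fields(obj):
        value = getattr(obj, f.name)
        # A list yields one child per item; a scalar yields a single child.
        items = value if isinstance(value, list) else [value]
        for item in items:
            if is_dataclass(item):
                root.append(to_element(item, f.metadata["tag"]))
            else:
                ET.SubElement(root, f.metadata["tag"]).text = str(item)
    return root


author = Author("John")
author.books.append(Book("My cool book A", 80))
author.books.append(Book("My cool book B"))
xml_text = ET.tostring(to_element(author, "Author"), encoding="unicode")
print(xml_text)
```

Each list item becomes its own Book element directly under Author, with no wrapping container element, mirroring the structure of the output above.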

Now you can create really complex XML class representations with a simple and familiar dataclass approach, and you can override the utility methods to suit your needs.
