
Part of the Amara3 project, which offers a variety of data processing tools. This module adds MicroXML support, plus adaptation of classic XML.

Project description

amara3-xml

A data processing library built on Python 3 and MicroXML. This module adds MicroXML support, plus adaptation of classic XML. Requires Python 3.4+.

Use

Amara is focused on MicroXML rather than full XML. However, because most of the XML-like data you’ll be dealing with is XML 1.0, Amara provides capabilities to parse legacy XML and reduce it to MicroXML. In many cases the biggest implication is that namespace information is stripped. You can get pretty far by ignoring this, as long as you know what you’re doing.
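To make the namespace point concrete, here is a minimal sketch that uses only the standard library's xml.etree.ElementTree, not Amara. The local_name helper and the CLASH_XML sample are made up for illustration. It shows what is lost when namespaces are ignored: two elements that XML 1.0 treats as different end up with the same name.

```python
import xml.etree.ElementTree as ET

MONTY_XML = """<monty xmlns="urn:spam:ignored">
  <python spam="eggs">What do you mean "bleh"</python>
  <python ministry="abuse">But I was looking for argument</python>
</monty>"""


def local_name(tag):
    """Drop the '{namespace-uri}' prefix ElementTree puts on qualified names.

    Hypothetical helper for this sketch, not part of Amara.
    """
    return tag.rsplit("}", 1)[-1]


root = ET.fromstring(MONTY_XML)
print(root.tag)              # namespace-qualified name
print(local_name(root.tag))  # name with the namespace dropped

# Two vocabularies that XML 1.0 keeps apart collapse to one name
# once the namespace is ignored.
CLASH_XML = '<r xmlns:a="urn:a" xmlns:b="urn:b"><a:item/><b:item/></r>'
names = [local_name(child.tag) for child in ET.fromstring(CLASH_XML)]
print(names)
```

If your data never mixes vocabularies, this collapse is harmless. If it does, you need to tell the elements apart some other way before reducing to MicroXML.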

from amara3.uxml import xml

MONTY_XML = """<monty xmlns="urn:spam:ignored">
  <python spam="eggs">What do you mean "bleh"</python>
  <python ministry="abuse">But I was looking for argument</python>
</monty>"""

builder = xml.treebuilder()
root = builder.parse(MONTY_XML)
print(root.xml_name) #"monty"
children = iter(root.xml_children) #One shared iterator, so each next() advances
child = next(children)
print(child) #First text node: "\n  "
child = next(children)
print(child.xml_value) #"What do you mean "bleh""
print(child.xml_attributes["spam"]) #"eggs"

There are some utilities to make this a bit easier as well.

from amara3.uxml import xml
from amara3.uxml.treeutil import select_name, select_attribute

MONTY_XML = """<monty xmlns="urn:spam:ignored">
  <python spam="eggs">What do you mean "bleh"</python>
  <python ministry="abuse">But I was looking for argument</python>
</monty>"""

builder = xml.treebuilder()
root = builder.parse(MONTY_XML)
py1 = next(select_name(root, "python"))
print(py1.xml_value) #"What do you mean "bleh""
py2 = next(select_attribute(root, "ministry", "abuse"))
print(py2.xml_value) #"But I was looking for argument"
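The pattern in that example is "lazily yield the matching children, then take the first with next()". Here is a rough standard-library analogue, assuming xml.etree.ElementTree rather than Amara's tree. The helpers children_named and children_with_attribute are hypothetical names for this sketch and say nothing about how treeutil is implemented.

```python
import xml.etree.ElementTree as ET

MONTY_XML = """<monty>
  <python spam="eggs">What do you mean "bleh"</python>
  <python ministry="abuse">But I was looking for argument</python>
</monty>"""


def children_named(elem, name):
    """Lazily yield child elements whose tag equals `name` (hypothetical helper)."""
    return (child for child in elem if child.tag == name)


def children_with_attribute(elem, attr, value):
    """Lazily yield child elements whose `attr` equals `value` (hypothetical helper)."""
    return (child for child in elem if child.get(attr) == value)


root = ET.fromstring(MONTY_XML)
first = next(children_named(root, "python"))
print(first.text)

# Passing a default to next() avoids StopIteration when nothing matches.
match = next(children_with_attribute(root, "ministry", "abuse"), None)
print(match.text if match is not None else "no match")
```

Because the helpers return generators, nothing is scanned until next() asks for a result, and the scan stops at the first match.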

Experimental MicroXML parser

For this parser the input truly must be MicroXML. Basics:

>>> from amara3.uxml.parser import parse
>>> events = parse('<hello><bold>world</bold></hello>')
>>> for ev in events: print(ev)
...
(<event.start_element: 1>, 'hello', {}, [])
(<event.start_element: 1>, 'bold', {}, ['hello'])
(<event.characters: 3>, 'world')
(<event.end_element: 2>, 'bold', ['hello'])
(<event.end_element: 2>, 'hello', [])
>>>
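One way to consume an event stream like this is with a stack that tracks the open elements. The sketch below does not import amara3. It defines a local stand-in enum that mirrors the names and values printed above, and it replays the five events by hand. The outline function is a hypothetical name for illustration.

```python
from enum import Enum


class event(Enum):
    # Local stand-in mirroring the names and values in the printed output;
    # it is not imported from amara3.
    start_element = 1
    end_element = 2
    characters = 3


# The events from the session above, written out by hand.
EVENTS = [
    (event.start_element, "hello", {}, []),
    (event.start_element, "bold", {}, ["hello"]),
    (event.characters, "world"),
    (event.end_element, "bold", ["hello"]),
    (event.end_element, "hello", []),
]


def outline(events):
    """Return one indented line per event, using a stack to track nesting."""
    stack, lines = [], []
    for ev in events:
        kind = ev[0]
        if kind is event.start_element:
            lines.append("  " * len(stack) + "<" + ev[1] + ">")
            stack.append(ev[1])
        elif kind is event.end_element:
            stack.pop()
            lines.append("  " * len(stack) + "</" + ev[1] + ">")
        elif kind is event.characters:
            lines.append("  " * len(stack) + repr(ev[1]))
    return lines


print("\n".join(outline(EVENTS)))
```

The last item of each element event in the output above already lists the ancestors, so the stack is only there to show how nesting can be rebuilt from start and end events alone.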

Or…And now for something completely different!…Incremental parsing.

>>> from amara3.uxml.parser import parsefrags
>>> events = parsefrags(['<hello', '><bold>world</bold></hello>'])
>>> for ev in events: print(ev)
...
(<event.start_element: 1>, 'hello', {}, [])
(<event.start_element: 1>, 'bold', {}, ['hello'])
(<event.characters: 3>, 'world')
(<event.end_element: 2>, 'bold
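For comparison, the standard library offers the same feed-fragments-as-they-arrive idea through xml.etree.ElementTree.XMLPullParser. This is a sketch of the general technique with a different API, not of how parsefrags works internally. The drain helper is a hypothetical name for this example.

```python
import xml.etree.ElementTree as ET

# The same two fragments as above: the first one ends mid-tag.
fragments = ['<hello', '><bold>world</bold></hello>']


def drain(parser, into):
    """Move any events the parser has produced so far into `into`."""
    for kind, elem in parser.read_events():
        into.append((kind, elem.tag))


parser = ET.XMLPullParser(events=("start", "end"))
seen = []
for fragment in fragments:
    parser.feed(fragment)  # may complete zero or more events
    drain(parser, seen)
parser.close()
drain(parser, seen)  # pick up anything the parser held back until close()

for item in seen:
    print(item)
```

Exactly when each event becomes available depends on the underlying parser's buffering, which is why the sketch drains once more after close() instead of assuming every event appears right after feed().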

Author: Uche Ogbuji uche@ogbuji.net


Source Distribution

amara3.xml-3.0.2.tar.gz (44.0 kB)

Uploaded Source

File details

Details for the file amara3.xml-3.0.2.tar.gz.

File metadata

  • Download URL: amara3.xml-3.0.2.tar.gz
  • Size: 44.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.19.1 setuptools/39.0.1 requests-toolbelt/0.8.0 tqdm/4.26.0 CPython/3.7.0

File hashes

Hashes for amara3.xml-3.0.2.tar.gz
Algorithm Hash digest
SHA256 611936cf4d7f21e251b79b8d0528fd7e45ef6cd2a571cac9a3670d7bdf383511
MD5 c92bf6a55225971eddc52569de58a20d
BLAKE2b-256 f1078ef286315006f5d0ee64ecb69998dd8fab399a8ce2569a74f8891cad70af
