
Part of the Amara3 project, which offers a variety of data processing tools. This module adds MicroXML support and adaptation to classic XML.

Project description

A data processing library built on Python 3 and MicroXML.

Uche Ogbuji <uche@ogbuji.net>

More discussion, etc.: https://groups.google.com/forum/#!forum/akara

Install

Requires Python 3, the amara3-iri package, and pytest.

For the latter 2, you can do:

pip install pytest "amara3-iri>=3.0.0a2"

Use

Amara 3.0 focuses on MicroXML rather than full XML. However, because most of the XML-like data you’ll be dealing with is XML 1.0, Amara provides capabilities to parse legacy XML and reduce it to MicroXML. In many cases the biggest implication of this is that namespace information is stripped. As long as you know what you’re doing you can get pretty far by ignoring this, but make sure you know what you’re doing.

from amara3.uxml import xml

MONTY_XML = """<monty xmlns="urn:spam:ignored">
  <python spam="eggs">What do you mean "bleh"</python>
  <python ministry="abuse">But I was looking for argument</python>
</monty>"""

builder = xml.treebuilder()
root = builder.parse(MONTY_XML)
print(root.xml_name)  # "monty"
# Wrap the children in a single iterator so each next() call advances
children = iter(root.xml_children)
child = next(children)
print(child)  # First child: a whitespace-only text node ("\n  ")
child = next(children)
print(child.xml_value)  # 'What do you mean "bleh"'
print(child.xml_attributes["spam"])  # "eggs"
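To make the namespace caveat above concrete, here is a minimal sketch that uses only the standard library's ElementTree as a stand-in. It is not Amara's implementation; the `local_name` helper is hypothetical and just illustrates what is lost when a namespace-qualified name is reduced to a bare local name.

```python
import xml.etree.ElementTree as ET

MONTY_XML = """<monty xmlns="urn:spam:ignored">
  <python spam="eggs">What do you mean "bleh"</python>
  <python ministry="abuse">But I was looking for argument</python>
</monty>"""


def local_name(tag):
    """Drop the '{namespace-uri}' prefix ElementTree puts on qualified tags."""
    # Hypothetical helper for illustration only, not part of Amara
    return tag.rsplit("}", 1)[-1]


root = ET.fromstring(MONTY_XML)
# A namespace-aware parser keeps the URI attached to every element name
qualified = [el.tag for el in root.iter()]
# Reducing to local names discards it, which is the trade-off described above
stripped = [local_name(tag) for tag in qualified]

print(qualified[0])  # {urn:spam:ignored}monty
print(stripped)      # ['monty', 'python', 'python']
```

After stripping, two elements from different namespaces that share a local name become indistinguishable, so this only works when you know the vocabulary you are handling.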

There are some utilities to make this a bit easier as well.

from amara3.uxml import xml
from amara3.uxml.treeutil import select_name, select_attribute

MONTY_XML = """<monty xmlns="urn:spam:ignored">
  <python spam="eggs">What do you mean "bleh"</python>
  <python ministry="abuse">But I was looking for argument</python>
</monty>"""

builder = xml.treebuilder()
root = builder.parse(MONTY_XML)
# First child element named "python"
py1 = next(select_name(root, "python"))
print(py1.xml_value)  # 'What do you mean "bleh"'
# First child element whose "ministry" attribute is "abuse"
py2 = next(select_attribute(root, "ministry", "abuse"))
print(py2.xml_value)  # "But I was looking for argument"
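The selection helpers used above return iterators, which is why the results are consumed with next(). As a rough sketch of that pattern, here are two generator-based selectors written against the standard library's ElementTree. The names `children_named` and `children_with_attribute` are hypothetical, and this is not the treeutil source, only an illustration of lazy selection over child elements.

```python
import xml.etree.ElementTree as ET

MONTY_XML = """<monty>
  <python spam="eggs">What do you mean "bleh"</python>
  <python ministry="abuse">But I was looking for argument</python>
</monty>"""


def children_named(elem, name):
    """Lazily yield the child elements of elem whose tag equals name."""
    return (child for child in elem if child.tag == name)


def children_with_attribute(elem, attr, value=None):
    """Lazily yield child elements carrying attr (matching value, if given)."""
    for child in elem:
        if attr in child.attrib and (value is None or child.attrib[attr] == value):
            yield child


root = ET.fromstring(MONTY_XML)
first = next(children_named(root, "python"))
abuse = next(children_with_attribute(root, "ministry", "abuse"))
print(first.text)  # What do you mean "bleh"
print(abuse.text)  # But I was looking for argument
```

Because nothing is computed until next() is called, taking only the first match does not walk the remaining children.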

Experimental MicroXML parser

For this parser the input truly must be MicroXML. Basics:

>>> from amara3.uxml.parser import parse
>>> events = parse('<hello><bold>world</bold></hello>')
>>> for ev in events: print(ev)
...
(<event.start_element: 1>, 'hello', {}, [])
(<event.start_element: 1>, 'bold', {}, ['hello'])
(<event.characters: 3>, 'world')
(<event.end_element: 2>, 'bold', ['hello'])
(<event.end_element: 2>, 'hello', [])
>>>
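A consumer of this event stream typically keeps a stack of open elements. The sketch below shows one way to do that, collecting text by element path. To stay self-contained it uses a stand-in `Event` enum and a hard-coded event list copied from the output above; both are assumptions for illustration, and with the real parser you would compare against the library's own event type instead.

```python
from enum import Enum


class Event(Enum):
    # Stand-in for the parser's event type; values copied from the printed output
    start_element = 1
    end_element = 2
    characters = 3


# Same shape as the tuples printed above
EVENTS = [
    (Event.start_element, 'hello', {}, []),
    (Event.start_element, 'bold', {}, ['hello']),
    (Event.characters, 'world'),
    (Event.end_element, 'bold', ['hello']),
    (Event.end_element, 'hello', []),
]


def text_by_path(events):
    """Map each element path (e.g. 'hello/bold') to the text directly inside it."""
    stack, result = [], {}
    for ev in events:
        kind = ev[0]
        if kind is Event.start_element:
            stack.append(ev[1])  # element opened: push its name
        elif kind is Event.end_element:
            stack.pop()          # element closed: pop it
        elif kind is Event.characters and stack:
            path = "/".join(stack)
            result[path] = result.get(path, "") + ev[1]
    return result


paths = text_by_path(EVENTS)
print(paths)  # {'hello/bold': 'world'}
```

Concatenating onto any existing entry means text split across several characters events still ends up under one path.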

Or… And now for something completely different! Incremental parsing, where the input arrives in fragments:

>>> from amara3.uxml.parser import parsefrags
>>> events = parsefrags(['<hello', '><bold>world</bold></hello>'])
>>> for ev in events: print(ev)
...
(<event.start_element: 1>, 'hello', {}, [])
(<event.start_element: 1>, 'bold', {}, ['hello'])
(<event.characters: 3>, 'world')
(<event.end_element: 2>, 'bold', ['hello'])
(<event.end_element: 2>, 'hello', [])
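In practice the fragments would usually come from a file or socket rather than a literal list. Here is a minimal sketch of a chunking generator that could produce them; `read_fragments` is a hypothetical helper, and passing a generator to parsefrags (rather than a list, as in the example above) is an assumption, so that call is left as a comment.

```python
import io


def read_fragments(stream, size=8):
    """Yield successive chunks of at most `size` characters from a text stream."""
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        yield chunk


doc = '<hello><bold>world</bold></hello>'
fragments = list(read_fragments(io.StringIO(doc)))
print(fragments)

# Chunk boundaries fall wherever they fall, including in the middle of a tag,
# which is the situation an incremental parser has to cope with.
# Assumption: parsefrags accepts any iterable of strings, not only a list:
#   events = parsefrags(read_fragments(io.StringIO(doc)))
```

Reading in fixed-size chunks keeps memory use bounded regardless of the size of the document.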

Release history

3.0.0b5 (this version), 3.0.0b4, 3.0.0b3, 3.0.0b2, 3.0.0b1, 3.0.0a9, 3.0.0a8, 3.0.0a7, 3.0.0a6, 3.0.0a5, 3.0.0a4, 3.0.0a3, 3.0.0a2

Download files

Source distribution: amara3-xml-3.0.0b5.tar.gz (43.2 kB), uploaded Mar 26, 2018
