Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.

Project description

lxml is a Pythonic binding for the libxml2 and libxslt libraries. It provides safe and convenient access to these libraries using the ElementTree API.

It extends the ElementTree API significantly to offer support for XPath, RelaxNG, XML Schema, XSLT, C14N and much more.
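As a rough illustration of what "ElementTree API plus extensions" means in practice, the sketch below builds a tree with ElementTree-style calls and then queries it with XPath. The element names and data are invented for the example, and the calls shown are those of current lxml releases, which may differ in detail from the 1.0 beta.

```python
from lxml import etree

# Build a small tree with the ElementTree-style API.
# "catalog", "book" and "title" are made-up names for this sketch.
root = etree.Element("catalog")
for title, year in [("First", "2005"), ("Second", "2006")]:
    book = etree.SubElement(root, "book", year=year)
    etree.SubElement(book, "title").text = title

# XPath support is one of the lxml extensions beyond plain ElementTree.
titles_2006 = root.xpath("book[@year='2006']/title/text()")
print(titles_2006)
```

The same tree can still be walked with the familiar `find()` and `findall()` methods, so existing ElementTree code should need few changes.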

1.0.beta (2006-05-18)

Features added

  • Formatted output via pretty_print keyword to serialization functions

  • XSLT can block access to file system and network via XSLTAccessControl

  • ElementTree.write() no longer serializes in memory (reduced memory footprint)

  • Speedup of Element.findall(tag) and Element.getiterator(tag)

  • Support for writing the XML representation of Elements and ElementTrees to Python unicode strings via etree.tounicode()

  • Support for writing XSLT results to Python unicode strings via unicode()

  • Parsing a unicode string no longer copies the string (reduced memory footprint)

  • Parsing file-like objects now reads chunks rather than the whole file (reduced memory footprint)

  • Parsing StringIO objects from the start avoids copying the string (reduced memory footprint)

  • Read-only ‘docinfo’ attribute in ElementTree class holds DOCTYPE information, original encoding and XML version as seen by the parser

  • etree module can be compiled without libxslt by commenting out the line include "xslt.pxi" near the end of the etree.pyx source file

  • Better error messages in parser exceptions

  • Error reporting now also works in XSLT

  • Support for custom document loaders (URI resolvers) in parsers and XSLT; resolvers are registered at the parser level

  • Implementation of exslt:regexp for XSLT based on the Python ‘re’ module; it is enabled by default and can be switched off with the ‘regexp=False’ keyword argument

  • Support for exslt extensions (libexslt) and libxslt extra functions (node-set, document, write, output)

  • Substantial speedup in XPath.evaluate()

  • HTMLParser for parsing (broken) HTML

  • XMLDTDID function parses XML into a tuple (root node, ID dict) based on the xml:id implementation of libxml2 (as opposed to the ET-compatible XMLID)
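Several of the features listed above can be tried together in a short script. This is a minimal sketch written against the API of current lxml releases on Python 3, so names such as the `docinfo` attributes and the `XSLTAccessControl` keyword arguments are assumed to match what 1.0.beta offered; the sample documents are invented. On Python 3, `str()` plays the role that `unicode()` has in the changelog.

```python
import io
from lxml import etree

root = etree.fromstring("<a><b>text</b><c/></a>")

# pretty_print adds line breaks and indentation on serialization.
pretty = etree.tostring(root, pretty_print=True).decode("ascii")
print(pretty)

# docinfo exposes what the parser saw in the XML declaration and DOCTYPE.
# The DTD named here is hypothetical and is not loaded by default.
source = (b'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
          b'<!DOCTYPE root SYSTEM "root.dtd">\n'
          b'<root/>')
info = etree.parse(io.BytesIO(source)).docinfo
print(info.xml_version, info.encoding, info.root_name, info.system_url)

# HTMLParser recovers from markup that an XML parser would reject.
html_root = etree.fromstring("<p>unclosed<br>paragraph", etree.HTMLParser())
print(etree.tostring(html_root).decode("ascii"))

# XSLTAccessControl restricts what a stylesheet may read or write.
# Here every file system and network operation is denied.
access = etree.XSLTAccessControl(
    read_file=False, write_file=False, create_dir=False,
    read_network=False, write_network=False,
)
stylesheet = etree.XML(b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="count(//b)"/>
  </xsl:template>
</xsl:stylesheet>""")
transform = etree.XSLT(stylesheet, access_control=access)

# Convert the XSLT result to a text string.
count_text = str(transform(root))
print(count_text)
```

Denying all access is the most conservative setting; a stylesheet that needs `document()` to read local files would require `read_file=True`.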

Bugs fixed

  • Some ElementTree methods could crash if the root node was not initialized (neither file nor element passed to the constructor)

  • Element/SubElement failed to set attribute namespaces from passed attrib dictionary

  • tostring() now adds an XML declaration for non-ASCII encodings

  • tostring() failed to serialize encodings that contain 0-bytes

  • ElementTree.xpath() and XPathDocumentEvaluator were not using the ElementTree root node as reference point

  • Calling document('') in XSLT failed to return the stylesheet
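Two of the fixes above are easy to check by hand. The sketch below assumes a current lxml release on Python 3; the namespace URI, element name and text are invented for the example, and it only demonstrates the corrected behaviour rather than reproducing the original bugs.

```python
from lxml import etree

# Namespaced attributes passed in the attrib dictionary use the
# "{namespace}name" notation. The URI is a made-up example.
ns = "http://example.com/ns"
attr_name = "{%s}id" % ns
item = etree.Element("item", {attr_name: "42"})
item.text = "caf\u00e9"
print(item.get(attr_name))

# Serializing to a non-ASCII encoding includes an XML declaration,
# so a reader of the byte string can tell how to decode it.
latin1 = etree.tostring(item, encoding="ISO-8859-1")
print(latin1)

# UTF-16 output contains zero bytes; it should round-trip cleanly.
utf16 = etree.tostring(item, encoding="UTF-16")
reparsed = etree.fromstring(utf16)
print(reparsed.text == item.text)
```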

Download files

Source Distribution

  • lxml-1.0.beta.tar.gz (354.2 kB, source)

Built Distributions

  • lxml-1.0.beta.win32-py2.4.exe (216.1 kB, Python 2.4)

  • lxml-1.0.beta-py2.4-win32.egg (156.3 kB, Python 2.4)

  • lxml-1.0.beta-py2.4-linux-x86_64.egg (200.7 kB, Python 2.4)

  • lxml-1.0.beta-py2.4-linux-i686.egg (172.4 kB, Python 2.4)
