Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.
lxml is a Pythonic, mature binding for the libxml2 and libxslt libraries. It provides safe and convenient access to these libraries using the ElementTree API.
It extends the ElementTree API significantly to offer support for XPath, RelaxNG, XML Schema, XSLT, C14N and much more.
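As a rough illustration of how the ElementTree-style API and the XPath extension fit together, here is a minimal sketch. The sample document and variable names are invented for this example, and behaviour may vary slightly between lxml versions.

```python
from lxml import etree

# Hypothetical sample document, used only for illustration.
root = etree.fromstring(
    "<library>"
    "<book year='2005'><title>First</title></book>"
    "<book year='2007'><title>Second</title></book>"
    "</library>"
)

# ElementTree-style access: iterate over child elements and read attributes.
years = [book.get("year") for book in root]

# lxml extension: evaluate an XPath expression against the tree.
titles = root.xpath("//book[@year > 2006]/title/text()")
```

Here `years` holds the attribute values in document order, and `titles` holds the text of the titles selected by the XPath predicate.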
To use the current in-development version of lxml, get it from the Subversion repository at http://codespeak.net/svn/lxml/trunk . Running easy_install lxml==dev installs it from http://codespeak.net/svn/lxml/trunk#egg=lxml-dev
Current bug fixes for the stable version are at http://codespeak.net/svn/lxml/branch/lxml-1.3 . Running easy_install lxml==1.3bugfix will install this version from http://codespeak.net/svn/lxml/branch/lxml-1.3#egg=lxml-1.3bugfix
- Module lxml.pyclasslookup implements an Element class lookup scheme that can access the entire tree to determine a suitable Element class
- Parsers take a remove_comments keyword argument that skips over comments
- parse() function in objectify, corresponding to XML() etc.
- Element.addnext(el) and Element.addprevious(el) methods to support adding processing instructions and comments around the root node
- Extended type annotation in objectify: cleaner annotation namespace setup plus new deannotate() function
- Support for custom Element class instantiation in lxml.sax: passing a makeelement() function to the ElementTreeContentHandler will reuse the lookup context of that function
- ‘.’ represents the empty ObjectPath (identity)
- Removing Elements from a tree could make them lose their namespace declarations
- ElementInclude didn’t honour base URL of original document
- Replacing the children slice of an Element would cut off the tails of the original children
- Element.getiterator(tag) did not accept Comment and ProcessingInstruction as tags
- API functions now check incoming strings for XML conformity. Zero bytes or low ASCII characters are no longer accepted.
- XSLT parsing failed to pass resolver context on to imported documents
- More ET-compatible behaviour when deciding whether to write out an XML declaration
- Element.attrib was missing clear() and pop() methods
- More robust error handling in iterparse()
- Documents lost their top-level PIs and comments on serialisation
- lxml.sax failed on comments and PIs. Comments are now properly ignored and PIs are copied.
- Raise AssertionError when passing strings containing ‘\0’ bytes
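
A few of the entries above (the remove_comments parser argument, Element.addprevious() around the root node, and the clear() and pop() methods of Element.attrib) can be exercised roughly as follows. This is a minimal sketch: the sample XML and variable names are made up for illustration, and exact serialisation details may depend on the installed lxml and libxml2 versions.

```python
from lxml import etree

# A parser created with remove_comments=True skips comments while parsing.
parser = etree.XMLParser(remove_comments=True)
root = etree.fromstring("<root><!-- note --><a x='1' y='2'/></root>", parser)
child_tags = [child.tag for child in root]  # the comment is not in the tree

# addprevious() on the root node places a comment before the root element.
root.addprevious(etree.Comment("generated"))
serialized = etree.tostring(root.getroottree())

# Element.attrib supports dict-style pop() and clear().
a = root[0]
x_value = a.attrib.pop("x")  # removes 'x' and returns its value
a.attrib.clear()             # removes the remaining attributes
remaining = len(a.attrib)
```

Serialising through the ElementTree returned by getroottree(), rather than the root Element itself, is what keeps the top-level comment in the output.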