Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.
lxml is a Pythonic, mature binding for the libxml2 and libxslt libraries. It provides safe and convenient access to these libraries using the ElementTree API.
It extends the ElementTree API significantly to offer support for XPath, RelaxNG, XML Schema, XSLT, C14N and much more.
If you want to use the current in-development version of lxml, you can get it from the Subversion repository (http://codespeak.net/svn/lxml/trunk). Running easy_install lxml==dev will install it from http://codespeak.net/svn/lxml/trunk#egg=lxml-dev
- Error logging in Schematron (requires libxml2 2.6.32 or later).
- Parser option strip_cdata for normalising or keeping CDATA sections. It defaults to True, which keeps the previous behaviour of replacing CDATA sections with their text content.
- CDATA() factory to wrap string content as CDATA section.
- Resolving to a filename in custom resolvers didn’t work.
- lxml did not honour libxslt’s second error state “STOPPED”, which let some XSLT errors pass silently.
- Memory leak in Schematron with libxml2 >= 2.6.31.
- lxml.etree accepted non well-formed namespace prefix names.
- Major cleanup in internal moveNodeToDocument() function, which takes care of namespace cleanup when moving elements between different namespace contexts.
- New Elements created through the makeelement() method of an HTML parser or through lxml.html now end up in a new HTML document (doctype HTML 4.01 Transitional) instead of a generic XML document. This mostly impacts the serialisation and the availability of a DTD context.
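
The CDATA() factory and the strip_cdata parser option listed above can be illustrated with a minimal sketch. It assumes an lxml version that includes both features; the element name and the sample text are made up for illustration.

```python
from lxml import etree

# Wrap text in a CDATA section when building a tree.
# "root" and the sample text are illustrative only.
root = etree.Element("root")
root.text = etree.CDATA("a < b && b > c")
serialized = etree.tostring(root)

# Default parser: strip_cdata=True, so the CDATA section is replaced
# by plain text content and is escaped again on serialisation.
default_parser = etree.XMLParser()
stripped = etree.fromstring(serialized, default_parser)

# strip_cdata=False keeps the CDATA section in the parsed tree.
keeping_parser = etree.XMLParser(strip_cdata=False)
kept = etree.fromstring(serialized, keeping_parser)

# The text content is identical either way; only serialisation differs.
print(stripped.text == kept.text)
print(etree.tostring(stripped))
print(etree.tostring(kept))
```

In both cases the .text property returns the plain string, so code that only reads text content is unaffected by the option. The difference shows up when the tree is serialised again.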