Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.
lxml is a Pythonic, mature binding for the libxml2 and libxslt libraries. It provides safe and convenient access to these libraries using the ElementTree API.
It extends the ElementTree API significantly to offer support for XPath, RelaxNG, XML Schema, XSLT, C14N and much more.
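As a minimal sketch of what this looks like in practice, the snippet below parses a small document with the ElementTree-style API and then queries it with XPath. The sample XML and the variable names are made up for illustration.

```python
from lxml import etree

# Illustrative document; the content is an assumption, not from lxml itself.
xml = (
    "<library>"
    "<book id='1'><title>A</title></book>"
    "<book id='2'><title>B</title></book>"
    "</library>"
)

# ElementTree-style parsing and navigation.
root = etree.fromstring(xml)
first_title = root.find("book/title").text

# XPath support, one of the lxml extensions to the ElementTree API.
titles = root.xpath("//book[@id='2']/title/text()")
book_count = int(root.xpath("count(//book)"))
```

Here `find()` is the plain ElementTree call, while `xpath()` accepts full XPath 1.0 expressions, including functions such as `count()`.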
If you want to use the current in-development version of lxml, you can get it from the GitHub repository at https://github.com/lxml/lxml. Note that this requires Cython to build the sources; see the build instructions on the project home page. Alternatively, running easy_install lxml==dev will install lxml from https://github.com/lxml/lxml/tarball/master#egg=lxml-dev if you have an appropriate version of Cython installed.
- New option handle_failures in make_links_absolute() and resolve_base_href() (lxml.html) that enables ignoring or discarding links that fail to parse as URLs.
- New parser classes XMLPullParser and HTMLPullParser for incremental parsing, as implemented for ElementTree in Python 3.4.
- LP#1255132: crash when trying to run validation over a non-Element node (e.g. a comment or PI).
- Error messages in the log and in exception messages that originated from libxml2 could accidentally be picked up from preceding warnings instead of the actual error.
- The ElementMaker in lxml.objectify did not accept a dict as argument for adding attributes to the element it’s building. This works as in lxml.builder now.
- LP#1228881: repr(XSLTAccessControl) failed in Python 3.
- Raise ValueError when trying to append an Element to itself or to one of its own descendants, instead of running into an infinite loop.
- LP#1206077: htmldiff discarded whitespace from the output.
- Compressed plain-text serialisation to file-like objects was broken.
- lxml.html.formfill: Fix textarea form filling. The textarea used to be cleared before the new content was set, which removed the name attribute.
- Some basic API classes use freelists internally for faster instantiation. This can speed up some iterparse() scenarios, for example.
- iterparse() was rewritten to use the new *PullParser classes internally.
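The XMLPullParser class mentioned above can be used roughly as follows. This is a hedged sketch: the chunk boundaries and the sample document are arbitrary assumptions, chosen only to show that input may be split at any point, even inside a tag.

```python
from lxml import etree

# Request both "start" and "end" events from the incremental parser.
parser = etree.XMLPullParser(events=("start", "end"))

# Hypothetical input arriving in pieces, e.g. from a socket.
chunks = [b"<root><item>one</it", b"em><item>two</item>", b"</root>"]

seen = []
for chunk in chunks:
    parser.feed(chunk)
    # Drain whatever events the parser could produce so far.
    for event, element in parser.read_events():
        seen.append((event, element.tag))

# Finish parsing, then drain any events that were still pending.
parser.close()
for event, element in parser.read_events():
    seen.append((event, element.tag))
```

Which events become available after each individual `feed()` call depends on the parser's internal buffering, so code should rely only on the overall order of events, not on when they are delivered.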