Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.
lxml is a Pythonic, mature binding for the libxml2 and libxslt libraries. It provides safe and convenient access to these libraries using the ElementTree API.
It extends the ElementTree API significantly to offer support for XPath, RelaxNG, XML Schema, XSLT, C14N and much more.
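As a rough illustration of how the ElementTree-style API and the XPath extension sit side by side, here is a minimal sketch. The document content and variable names are made up for the example; only `etree.fromstring`, `find` and `xpath` are lxml calls.

```python
from lxml import etree

# Hypothetical sample document, used only for illustration.
root = etree.fromstring(
    "<catalog>"
    "<book id='b1'><title>XML Basics</title></book>"
    "<book id='b2'><title>XSLT in Depth</title></book>"
    "</catalog>"
)

# Plain ElementTree-style lookup: first matching child path.
first_title = root.find("book/title").text

# lxml extension: full XPath 1.0 expressions on any element.
titles = root.xpath("//book[@id='b2']/title/text()")
ids = root.xpath("//book/@id")

print(first_title, titles, ids)
```

The `find` call uses the limited ElementPath syntax shared with the standard library, while `xpath` accepts predicates, attribute axes and text nodes.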
To use the current in-development version of lxml, get it from the GitHub repository at https://github.com/lxml/lxml . Note that this requires Cython to build the sources; see the build instructions on the project home page. Alternatively, running easy_install lxml==dev will install lxml from https://github.com/lxml/lxml/tarball/master#egg=lxml-dev if you have an appropriate version of Cython installed.
After an official release of a new stable series, bug fixes may become available at https://github.com/lxml/lxml/tree/lxml-3.4 . Running easy_install lxml==3.4bugfix will install the unreleased branch state from https://github.com/lxml/lxml/tarball/lxml-3.4#egg=lxml-3.4bugfix as soon as a maintenance branch has been established. Note that this requires Cython to be installed at an appropriate version for the build.
- xmlfile(buffered=False) disables output buffering and flushes the content after each API operation (starting/ending element blocks or writes). A new method xf.flush() can alternatively be used to explicitly flush the output.
- lxml.html.document_fromstring has a new option ensure_head_body=True which will add an empty head and/or body element to the result document if missing.
- lxml.html.iterlinks now returns links inside meta refresh tags.
- New XMLParser option collect_ids=False to disable ID hash table creation. This can substantially speed up parsing of documents with many different IDs that are not used.
- The parser uses per-document hash tables for XML IDs. This reduces the load of the global parser dict and speeds up parsing for documents with many different IDs.
- ElementTree.getelementpath(element) returns a structural ElementPath expression for the given element, which can be used for lookups later.
- xmlfile() accepts a new argument close=True to close file(-like) objects after writing to them. Before, xmlfile() only closed the file if it had opened it internally.
- Allow “bytearray” type for ASCII text input.
- LP#400588: decoding errors have become hard errors even in recovery mode. Previously, they could lead to an internal tree representation in a mixed encoding state, which led to very late errors or even silently incorrect behaviour during tree traversal or serialisation.
- Requires Python 2.6, 2.7, 3.2 or later. Python 2.4, 2.5 and 3.1 are no longer supported; use lxml 3.3.x for those.
- Requires libxml2 2.7.0 or later and libxslt 1.1.23 or later. Use lxml 3.3.x with older versions.
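
Several of the additions listed above can be tried together in a short script. The following is a minimal sketch, assuming lxml 3.4 or later is installed; the sample documents, element names and variable names are hypothetical and chosen only for illustration.

```python
import io

import lxml.html
from lxml import etree

# xmlfile(buffered=False): output is flushed after each API operation.
# xf.flush() is the explicit alternative when buffering is left enabled.
out = io.BytesIO()
with etree.xmlfile(out, encoding="utf-8", buffered=False) as xf:
    with xf.element("log"):
        entry = etree.Element("entry")
        entry.text = "started"
        xf.write(entry)
xml_bytes = out.getvalue()

# ElementTree.getelementpath(): build a structural path for an element
# and use it for a later lookup on the same tree.
root = etree.fromstring("<a><b/><b><c/></b></a>")
tree = etree.ElementTree(root)
target = root[1][0]
path = tree.getelementpath(target)
found = tree.find(path)

# XMLParser(collect_ids=False): skip building the ID hash table.
parser = etree.XMLParser(collect_ids=False)
doc = etree.fromstring("<r><x xml:id='n1'/></r>", parser)

# document_fromstring(ensure_head_body=True): add missing head/body.
page = lxml.html.document_fromstring("<p>hello</p>", ensure_head_body=True)
has_head = page.find("head") is not None
has_body = page.find("body") is not None

print(xml_bytes, path, found is target, doc.tag, has_head, has_body)
```

The `close=True` argument of `xmlfile()` is left out here on purpose: it would close the `BytesIO` object on exit, and the script still needs to read the buffer afterwards.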