Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.

Project description

lxml is a Pythonic, mature binding for the libxml2 and libxslt libraries. It provides safe and convenient access to these libraries using the ElementTree API.

It extends the ElementTree API significantly to offer support for XPath, RelaxNG, XML Schema, XSLT, C14N and much more.
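
As a rough illustration of how the ElementTree API and the XPath extension fit together, here is a minimal sketch. The element names, attribute values and the XPath expression are made up for this example and are not taken from the lxml documentation.

```python
from lxml import etree

# Build a small tree with the ElementTree-style API.
# The tag names and data below are hypothetical sample content.
root = etree.Element("library")
for title, year in [("Dune", "1965"), ("Neuromancer", "1984")]:
    book = etree.SubElement(root, "book", year=year)
    book.text = title

# ElementTree-compatible lookup: first matching child element.
first = root.find("book")

# lxml extension: evaluate an XPath expression relative to the element.
recent_titles = root.xpath("book[@year > 1970]/text()")
```

Here `find()` is the part that plain ElementTree code would also use, while `xpath()` is one of the lxml additions mentioned above.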

To contact the project, go to the project home page or see our bug tracker at

In case you want to use the current in-development version of lxml, you can get it from the subversion repository at . Running easy_install lxml==dev will install it from

2.1beta3 (2008-06-19)

Features added

  • Major overhaul of tools/ script.

  • Pickling ElementTree objects in lxml.objectify.

  • Support for parsing from file-like objects that return unicode strings.

  • New function etree.cleanup_namespaces(el) that removes unused namespace declarations from a (sub)tree (experimental).

  • XSLT results support the buffer protocol in Python 3.

  • Polymorphic functions in lxml.html that accept either a tree or a parsable string will return either a UTF-8 encoded byte string, a unicode string or a tree, based on the type of the input. Previously, the result was always a byte string or a tree.

  • Support for Python 2.6 and 3.0 beta.

  • File name handling now uses a heuristic to convert between byte strings (usually filenames) and unicode strings (usually URLs).

  • Parsing from a plain file object frees the GIL under Python 2.x.

  • Running iterparse() on a plain file (or filename) frees the GIL on reading under Python 2.x.

  • Conversion functions html_to_xhtml() and xhtml_to_html() in lxml.html (experimental).

  • Most features in lxml.html work for XHTML namespaced tag names (experimental).
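
Two of the features listed above, parsing from a file-like object that returns unicode strings and `etree.cleanup_namespaces()`, can be sketched together as follows. This assumes a current lxml on Python 3; the sample document and its namespace URI are hypothetical.

```python
import io

from lxml import etree

# A text-mode source: read() returns unicode strings rather than bytes.
# The XML carries no encoding declaration, since the text is already decoded.
source = io.StringIO(
    '<root xmlns:unused="http://example.com/unused">'
    "<child>caf\u00e9</child>"
    "</root>"
)
root = etree.parse(source).getroot()

# Serialise before and after cleanup to compare the namespace declarations.
before = etree.tostring(root)
etree.cleanup_namespaces(root)  # drops declarations that nothing in the tree uses
after = etree.tostring(root)
```

In this sketch the `unused` prefix is declared but never used by any element or attribute, so it appears in `before` and is gone from `after`.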

Bugs fixed

  • ElementTree.parse() did not handle the result of a target parser.

  • Crash in Element class lookup classes when the __init__() method of the super class is not called from Python subclasses.

  • A number of problems related to unicode/byte string conversion of filenames and error messages were fixed.

  • Building on MacOS-X now passes the “flat_namespace” option to the C compiler, which reportedly prevents build quirks and crashes on this platform.

  • Windows build was broken.

  • Rare crash when serialising to a file object with certain encodings.

Other changes

  • Non-ASCII characters in attribute values are no longer escaped on serialisation.

  • Passing non-ASCII byte strings or invalid unicode strings as .tag, namespaces, etc. will result in a ValueError instead of an AssertionError (just like the tag well-formedness check).

  • Up to several times faster attribute access (i.e. tree traversal) in lxml.objectify.
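
The change from AssertionError to ValueError can be observed with a short sketch like the one below. It assumes a current lxml on Python 3, and the two invalid tag values are arbitrary examples chosen for illustration.

```python
from lxml import etree

# Two hypothetical bad inputs: a tag that is not well-formed (contains spaces)
# and a byte string with non-ASCII bytes.
rejected = []
for bad_tag in ("not a valid tag", b"caf\xc3\xa9"):
    try:
        etree.Element(bad_tag)
    except ValueError as exc:
        # Both cases are reported as ValueError rather than AssertionError.
        rejected.append((bad_tag, str(exc)))
```

Catching `ValueError` keeps this kind of input validation working even when Python runs with assertions disabled.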

Download files

Source Distribution

  • lxml-2.1beta3.tar.gz (2.5 MB)

Built Distributions

  • lxml-2.1beta3.win32-py2.5.exe (2.4 MB)

  • lxml-2.1beta3.win32-py2.4.exe (2.4 MB)

  • lxml-2.1beta3-py2.5-win32.egg (2.4 MB)

  • lxml-2.1beta3-py2.4-win32.egg (2.4 MB)