Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.

Project description

lxml is a Pythonic, mature binding for the libxml2 and libxslt libraries. It provides safe and convenient access to these libraries using the ElementTree API.

It extends the ElementTree API significantly to offer support for XPath, RelaxNG, XML Schema, XSLT, C14N and much more.

To contact the project, go to the project home page or see our bug tracker at

In case you want to use the current in-development version of lxml, you can get it from the subversion repository at . Running easy_install lxml==dev will install it from

2.0alpha5 (2007-11-24)

Features added

  • Rich comparison of element.attrib proxies.

  • ElementTree compatible TreeBuilder class.

  • Use default prefixes for some common XML namespaces.

  • lxml.html.clean.Cleaner now allows for a host_whitelist, and two overridable methods: allow_embedded_url(el, url) and the more general allow_element(el).

  • Extended slicing of Elements as in element[1:-1:2], both in etree and in objectify.

  • Resolvers can now provide a base_url keyword argument when resolving a document as string data.

  • When using lxml.doctestcompare you can give the doctest option NOPARSE_MARKUP (like # doctest: +NOPARSE_MARKUP) to suppress the special checking for one test.
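
A few of the features above can be tried out in a short session. The following is only a minimal sketch: the sample documents and variable names are made up for illustration, and the import falls back to the standard library's xml.etree.ElementTree (which shares this part of the API) so that the snippet also runs where lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree is an acceptable stand-in here,
    # since it offers the same calls used in this sketch.
    import xml.etree.ElementTree as etree

# Extended slicing: every second child, skipping the first and last one.
root = etree.fromstring("<root><a/><b/><c/><d/><e/></root>")
every_other_inner = [el.tag for el in root[1:-1:2]]
print(every_other_inner)  # ['b', 'd']

# Rich comparison of attrib proxies: attribute order does not matter.
left = etree.fromstring('<item id="1" lang="en"/>')
right = etree.fromstring('<item lang="en" id="1"/>')
same_attributes = left.attrib == right.attrib
print(same_attributes)  # True

# ElementTree compatible TreeBuilder: build an element from events.
builder = etree.TreeBuilder()
builder.start("greeting", {"lang": "en"})
builder.data("hello")
builder.end("greeting")
built = builder.close()
print(built.tag, built.text, built.get("lang"))  # greeting hello en
```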

Bugs fixed

  • Target parser failed to report comments.

  • In the lxml.html iter_links method, links in <object> tags weren’t recognized. (Note: plugin-specific link parameters still aren’t recognized.) Also, the <embed> tag, though not standard, is now included in lxml.html.defs.special_inline_tags.

  • Using custom resolvers on XSLT stylesheets parsed from a string could request ill-formed URLs.

  • With lxml.doctestcompare, if you write <tag xmlns="..."> in your output, it is now namespace-neutral (previously, the ellipsis was treated as a real namespace).
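
To illustrate the target parser fix, here is a small sketch of a parser target that records comments alongside element events. The EventCollector class and the sample document are hypothetical. As an assumption for portability, the import falls back to the standard library's ElementTree, which calls the same target methods on Python 3.8 and later.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: stdlib fallback; its XMLParser forwards comments to the
    # target only on Python 3.8 and later.
    import xml.etree.ElementTree as etree


class EventCollector:
    """Hypothetical parser target that records the events it receives."""

    def __init__(self):
        self.events = []

    def start(self, tag, attrib):
        self.events.append(("start", tag))

    def end(self, tag):
        self.events.append(("end", tag))

    def data(self, text):
        pass  # text content is ignored in this sketch

    def comment(self, text):
        self.events.append(("comment", text))

    def close(self):
        # Whatever close() returns becomes the result of the parse call.
        return self.events


parser = etree.XMLParser(target=EventCollector())
events = etree.fromstring("<root><!-- note --><child/></root>", parser)
for event in events:
    print(event)
```

With a target attached, the parser builds no tree of its own, so the call returns the list handed back by close().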

Other changes

  • The module source files were renamed to “lxml.*.pyx”, such as “lxml.etree.pyx”. This was changed for consistency with the way Pyrex commonly handles package imports. The main effect is that classes now know about their fully qualified class name, including the package name of their module.

  • Keyword-only arguments in some API functions, especially in the parsers and serialisers.

Download files

Source Distribution

lxml-2.0alpha5.tar.gz (2.2 MB, source)

Built Distributions

lxml-2.0alpha5.win32-py2.5.exe (3.2 MB, Python 2.5)

lxml-2.0alpha5.win32-py2.4.exe (3.2 MB, Python 2.4)

lxml-2.0alpha5-py2.5-win32.egg (3.2 MB, Python 2.5)

lxml-2.0alpha5-py2.4-win32.egg (3.2 MB, Python 2.4)
