
Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API.

Project description

lxml is a Pythonic, mature binding for the libxml2 and libxslt libraries. It provides safe and convenient access to these libraries using the ElementTree API.

It extends the ElementTree API significantly to offer support for XPath, RelaxNG, XML Schema, XSLT, C14N and much more.
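
As a minimal sketch of what this combination looks like in practice, the snippet below parses a small document with the ElementTree-style API and then queries it with XPath. The sample document and variable names are invented for illustration, and the calls are those of a current lxml release rather than anything specific to 2.0alpha1.

```python
from lxml import etree

# Hypothetical sample document, used only for illustration.
xml = (
    "<library>"
    "<book id='1'><title>A</title></book>"
    "<book id='2'><title>B</title></book>"
    "</library>"
)
root = etree.fromstring(xml)

# ElementTree-style access: iterate over elements and read attributes.
ids = [book.get("id") for book in root.iter("book")]

# lxml extension: evaluate an XPath expression directly on an element.
titles = root.xpath("//book/title/text()")
```

Here `ids` collects the `id` attributes through plain ElementTree calls, while `titles` comes from the XPath support that lxml adds on top.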

To contact the project, go to the project home page or see our bug tracker at https://launchpad.net/lxml

If you want to use the current in-development version of lxml, you can get it from the Subversion repository at http://codespeak.net/svn/lxml/trunk . Running easy_install lxml==dev will install it from http://codespeak.net/svn/lxml/trunk#egg=lxml-dev

2.0alpha1 (2007-09-02)

Features added

  • Reimplemented objectify.E for better performance and improved integration with objectify. Provides extended type support based on registered PyTypes.
  • XSLT objects now support deep copying
  • New makeSubElement() C-API function that creates a new subelement directly with text, tail and attributes.
  • XPath extension functions can now access the current context node (context.context_node) and use a context dictionary (context.eval_context) from the context provided in their first parameter
  • HTML tag soup parser based on BeautifulSoup in lxml.html.ElementSoup
  • New module lxml.doctestcompare by Ian Bicking for writing simplified doctests based on XML/HTML output. Use by importing lxml.usedoctest or lxml.html.usedoctest from within a doctest.
  • New module lxml.cssselect by Ian Bicking for selecting Elements with CSS selectors.
  • New package lxml.html written by Ian Bicking for advanced HTML treatment.
  • Namespace class setup is now local to the ElementNamespaceClassLookup instance and no longer global.
  • Schematron validation (incomplete in libxml2)
  • Additional stringify argument to objectify.PyType() takes a conversion function to strings to support setting text values from arbitrary types.
  • Entity support through an Entity factory and element classes. XML parsers now have a resolve_entities keyword argument that can be set to False to keep entities in the document.
  • column field on error log entries to accompany the line field
  • Error-specific messages in XPath parsing and evaluation. NOTE: for evaluation errors, you will now get an XPathEvalError instead of an XPathSyntaxError. To catch both, catch XPathError
  • The regular expression functions in XPath now support passing a node-set instead of a string
  • Extended type annotation in objectify: new xsiannotate() function
  • EXSLT RegExp support in standard XPath (not only XSLT)
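
Several of the XPath and entity features listed above can be sketched together. This is an illustration written against a current lxml release, not code from the 2.0alpha1 distribution; the sample documents, the function name `ctxtag` and the variable names are assumptions made for the example.

```python
from lxml import etree

root = etree.fromstring(
    "<list><item>apple</item><item>Banana</item><item>avocado</item></list>"
)

# EXSLT regular expressions in plain XPath: bind the EXSLT namespace
# to a prefix and call re:test() inside a predicate.
RE_NS = "http://exslt.org/regular-expressions"
a_items = root.xpath(
    "//item[re:test(., '^a', 'i')]/text()", namespaces={"re": RE_NS}
)

# Extension function: the first parameter is the evaluation context,
# which exposes the current context node.
def ctxtag(context):
    return context.context_node.tag

item_count = root.xpath(
    "count(//*[ctxtag() = 'item'])", extensions={(None, "ctxtag"): ctxtag}
)

# XPathError covers both compile-time and evaluation-time failures.
caught = []
try:
    etree.XPath("//item[")  # malformed expression
except etree.XPathError as exc:
    caught.append(exc)
try:
    root.xpath("//undeclared:item")  # prefix is not declared
except etree.XPathError as exc:
    caught.append(exc)

# Keep entity references in the tree instead of expanding them.
parser = etree.XMLParser(resolve_entities=False)
doc = etree.fromstring(
    '<!DOCTYPE doc [<!ENTITY who "World">]><doc>Hello &who;</doc>', parser
)
entity = doc[0]  # the unexpanded &who; reference
```

Catching `XPathError` keeps the handler independent of whether a problem shows up while compiling the expression or while evaluating it.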

Bugs fixed

  • lxml.etree did not check tag/attribute names
  • The XML parser did not report undefined entities as an error
  • The text in exceptions raised by XML parsers, validators and XPath evaluators now reports the first error that occurred instead of the last
  • Passing '' (the empty string) as XPath namespace prefix did not raise an error
  • Thread safety in XPath evaluators
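
The first two fixes above are easy to observe from Python. The following is a small sketch, assuming the exception types raised by current lxml releases (`ValueError` for invalid names, `XMLSyntaxError` for undefined entities); the inputs are made up for the example.

```python
from lxml import etree

errors = {}

# Tag names are validated when an element is created.
try:
    etree.Element("not a valid tag")
except ValueError as exc:
    errors["tag"] = exc

# A reference to an entity that was never declared is a parse error.
try:
    etree.fromstring("<doc>&undefined;</doc>")
except etree.XMLSyntaxError as exc:
    errors["entity"] = exc
```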

Other changes

  • objectify.PyType for None is now called “NoneType”
  • el.getiterator() renamed to el.iter(), following ElementTree 1.3 - original name is still available as alias
  • In the public C-API, findOrBuildNodeNs() was replaced by the more generic findOrBuildNodeNsPrefix()
  • Major refactoring in XPath/XSLT extension function code
  • Network access in parsers disabled by default
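
Two of the changes above affect everyday code: the `iter()` spelling and the parser's network policy. The sketch below assumes a current lxml release; the `no_network` keyword is not named in these notes and is shown as the parser option that, as far as I know, controls this behaviour today.

```python
from lxml import etree

root = etree.fromstring("<a><b/><c><b/></c></a>")

# el.iter() walks the element and all its descendants in document order.
all_tags = [el.tag for el in root.iter()]

# Passing a tag name restricts the iteration to matching elements.
b_count = len(list(root.iter("b")))

# Spell out the default explicitly: the parser will not fetch
# external resources over the network while parsing.
parser = etree.XMLParser(no_network=True)
offline_root = etree.fromstring("<a/>", parser)
```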

