
Python wrapper for HTML Tidy (tidylib), compatible with Python 2 and 3

Project Description

0.2.0: Works on Windows! See documentation for available DLL download locations. Documentation rewritten and expanded.

PyTidyLib is a Python package that wraps the HTML Tidy library. This allows you, from Python code, to “fix” invalid (X)HTML markup. Some of the library’s many capabilities include:

  • Clean up unclosed tags and unescaped characters such as ampersands
  • Output HTML 4 or XHTML, strict or transitional, and add missing doctypes
  • Convert named entities to numeric entities, which can then be used in XML documents without an HTML doctype.
  • Clean up HTML from programs such as Word (to an extent)
  • Indent the output, including proper (i.e. no) indenting for pre elements, which some (X)HTML indenting code overlooks.
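
The capabilities listed above correspond to HTML Tidy configuration options, which PyTidyLib accepts as a dictionary. The following is a minimal sketch rather than a recommended configuration: the option keys are HTML Tidy option names, while `TIDY_OPTIONS`, `clean_markup`, and the chosen values are hypothetical names and settings introduced here for illustration.

```python
# Illustrative mapping from the capabilities listed above to HTML Tidy options.
# The values are assumptions chosen for this sketch; adjust them to your needs.
TIDY_OPTIONS = {
    "output-xhtml": 1,      # emit XHTML instead of HTML 4
    "doctype": "strict",    # add a strict doctype when one is missing
    "numeric-entities": 1,  # convert named entities to numeric entities
    "word-2000": 1,         # strip markup left behind by Word (to an extent)
    "indent": 1,            # indent the output (pre elements are left alone)
}


def clean_markup(markup):
    """Run markup through HTML Tidy and return (document, errors)."""
    # Imported lazily so this module can be loaded even where the
    # HTML Tidy shared library is not installed.
    from tidylib import tidy_document
    return tidy_document(markup, options=TIDY_OPTIONS)
```

Calling `clean_markup('<p>caf&eacute;')` should return the tidied document together with Tidy's report, provided both PyTidyLib and the HTML Tidy library are installed.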

Small example of use

The following code cleans up an invalid HTML document and sets an option:

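from tidylib import tidy_document
# 'numeric-entities' makes Tidy emit &#245; in place of &otilde;
document, errors = tidy_document('''<p>f&otilde;o <img src="bar.jpg">''',
    options={'numeric-entities': 1})
print(document)
print(errors)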
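
The `errors` value in the example above is a single string containing one report line per problem. The sketch below splits such a string into structured records. It assumes the line shape `line N column M - Level: message`, which is how HTML Tidy commonly formats its reports; `parse_tidy_report` is a hypothetical helper, and the `sample` text is hand-written for illustration, not captured output.

```python
import re

# Assumed report line shape: "line 1 column 1 - Warning: some message"
_REPORT_LINE = re.compile(
    r"^line (?P<line>\d+) column (?P<column>\d+) - "
    r"(?P<level>\w+): (?P<message>.*)$"
)


def parse_tidy_report(report):
    """Return a list of (line, column, level, message) tuples."""
    entries = []
    for raw in report.splitlines():
        match = _REPORT_LINE.match(raw.strip())
        if match is None:
            continue  # skip blank lines and lines in any other shape
        entries.append((
            int(match.group("line")),
            int(match.group("column")),
            match.group("level"),
            match.group("message"),
        ))
    return entries


# Hand-written sample in the assumed format (positions are illustrative).
sample = (
    "line 1 column 1 - Warning: missing <!DOCTYPE> declaration\n"
    "line 1 column 13 - Warning: <img> lacks \"alt\" attribute\n"
)
for entry in parse_tidy_report(sample):
    print(entry)
```

Lines that do not match the assumed shape are skipped silently, so the helper degrades to an empty list rather than raising if the report format differs.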


Documentation is shipped with the source distribution and is available at the PyTidyLib web page.



Download Files


File type: Source
Python version: None
Size: 155.1 kB
Upload date: Dec 9, 2013
