Skip to main content
This is a pre-production deployment of Warehouse. Changes made here affect the production instance of PyPI (
Help us improve Python packaging - Donate today!

Truncating HTML with html5lib filter

Project Description

html5lib-truncation is a html5lib filter implementation, which can truncate HTML to specific length in display, but never breaks HTML tags.

There is a shortcut function, the simplest way to use it:

>>> from html5lib_truncation import truncate_html
>>> html = u'<p>A <a href="#">very very long link</a></p>'
>>> truncate_html(html, 8)
u'<p>A <a href=#>very</a>'
>>> truncate_html(html, 8, break_words=True)
u'<p>A <a href=#>very ve</a>'
>>> truncate_html(html, 20, end='...')
u'<p>A <a href=#>very very...</a>'
>>> truncate_html(html, 20, end='...', break_words=True)
u'<p>A <a href=#>very very lon...</a>'


pip install html5lib-truncation

Don’t forget to put it into your requirements.txt or

API Overview

The core API of html5lib-truncation is the filter:

import html5lib
from html5lib_truncation import TruncationFilter

etree = html5lib.parse(u'<p>A <a href="#">very very long link</a></p>')
walker = html5lib.getTreeWalker('etree')

stream = walker(etree)
stream = TruncationFilter(stream, 20, end='...', break_words=True)

serializer = html5lib.serializer.HTMLSerializer()
serialized = serializer.serialize(stream)


The output is <p>A <a href=#>very very lon...</a>.


If you want to report bugs or other issues, please create issues on GitHub Issues.


You can send a pull reueqst on GitHub.

Release History

Release History

This version
History Node


Download Files

Download Files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

File Name & Checksum SHA256 Checksum Help Version File Type Upload Date
html5lib_truncation-0.1.0-py2.py3-none-any.whl (8.0 kB) Copy SHA256 Checksum SHA256 py2.py3 Wheel Mar 10, 2015
html5lib-truncation-0.1.0.tar.gz (4.4 kB) Copy SHA256 Checksum SHA256 Source Mar 10, 2015

Supported By

WebFaction WebFaction Technical Writing Elastic Elastic Search Pingdom Pingdom Monitoring Dyn Dyn DNS Sentry Sentry Error Logging CloudAMQP CloudAMQP RabbitMQ Heroku Heroku PaaS Kabu Creative Kabu Creative UX & Design Fastly Fastly CDN DigiCert DigiCert EV Certificate Rackspace Rackspace Cloud Servers DreamHost DreamHost Log Hosting