Truncating HTML with html5lib filter

Project description

html5lib-truncation is an html5lib filter that truncates HTML to a given display length without ever breaking HTML tags.

The simplest way to use it is the shortcut function truncate_html:

>>> from html5lib_truncation import truncate_html
>>> html = u'<p>A <a href="#">very very long link</a></p>'
>>> truncate_html(html, 8)
u'<p>A <a href=#>very</a>'
>>> truncate_html(html, 8, break_words=True)
u'<p>A <a href=#>very ve</a>'
>>> truncate_html(html, 20, end='...')
u'<p>A <a href=#>very very...</a>'
>>> truncate_html(html, 20, end='...', break_words=True)
u'<p>A <a href=#>very very lon...</a>'
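
To illustrate the general idea of truncating by display length while keeping tags balanced, here is a minimal sketch that uses only the standard library. It is not how html5lib-truncation is implemented, and its output differs from truncate_html: it always cuts mid-word, does not count `end` against the limit, and keeps the original tag text instead of re-serializing it. The names TagSafeTruncator and truncate_text_in_html are hypothetical and exist only for this example.

```python
import html
from html.parser import HTMLParser

# Elements that never take a closing tag, so they are not tracked as "open".
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagSafeTruncator(HTMLParser):
    """Illustrative only: keep at most `limit` text characters."""

    def __init__(self, limit, end=""):
        super().__init__(convert_charrefs=True)
        self.remaining = limit   # text characters still allowed
        self.end = end           # marker appended where text is cut
        self.out = []            # output fragments
        self.open_tags = []      # stack of currently open elements
        self.done = False        # True once the limit has been exceeded

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        # Re-emit the start tag exactly as it appeared in the input.
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.done:
            return
        # Assumes well-nested input; stray end tags are ignored.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if self.done:
            return
        if len(data) <= self.remaining:
            self.out.append(html.escape(data, quote=False))
            self.remaining -= len(data)
        else:
            # Cut inside the text node, never inside a tag.
            kept = data[:self.remaining]
            self.out.append(html.escape(kept, quote=False) + self.end)
            self.remaining = 0
            self.done = True


def truncate_text_in_html(source, limit, end=""):
    parser = TagSafeTruncator(limit, end)
    parser.feed(source)
    parser.close()
    # Close whatever was still open at the cut point, innermost first.
    closing = "".join("</%s>" % tag for tag in reversed(parser.open_tags))
    return "".join(parser.out) + closing


source = '<p>A <a href="#">very very long link</a></p>'
print(truncate_text_in_html(source, 8, end="..."))
```

Only text nodes count toward the limit, so markup is never split. The real filter works on an html5lib tree walker stream rather than on raw tag text.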


Installation

pip install html5lib-truncation

Don’t forget to put it into your requirements.txt or

API Overview

The core API of html5lib-truncation is the filter:

import html5lib
from html5lib_truncation import TruncationFilter

etree = html5lib.parse(u'<p>A <a href="#">very very long link</a></p>')
walker = html5lib.getTreeWalker('etree')

stream = walker(etree)
stream = TruncationFilter(stream, 20, end='...', break_words=True)

serializer = html5lib.serializer.HTMLSerializer()
# serialize() yields string chunks lazily; render() joins them into one string
serialized = serializer.render(stream)


The output is <p>A <a href=#>very very lon...</a>.


To report bugs or other issues, please open an issue on GitHub Issues.


You can send a pull request on GitHub.

Download files

Files for html5lib-truncation, version 0.1.0
html5lib_truncation-0.1.0-py2.py3-none-any.whl (8.0 kB): Wheel, Python py2.py3
html5lib-truncation-0.1.0.tar.gz (4.4 kB): Source