
Clean up the HTML formatting problems introduced by pasting content from MSWord into Plone's RichText fields.

Project Description


This product cleans up the HTML formatting problems that are introduced by pasting content from MSWord into Plone’s RichText fields.

Every time an object is created or edited, the HTML in its RichText fields will be sanitized.

HTML sanitizing is turned on by default for all Archetypes objects, but can be turned off per object by checking a box in the ‘settings’ fieldset of the default edit view.


This product provides an event subscriber for all BaseContent Archetypes objects that will clean up the HTML of all the RichText fields for each object.
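
The subscriber behaviour described above can be sketched in plain Python. This is only an illustration, not the product's code: the `Document` class, the `rich_text_fields` attribute, the `clean_word_pasted_text` flag and the `strip_office_tags` cleaner are all hypothetical stand-ins for the Archetypes schema, the per-object checkbox and the real sanitizer. It also assumes that an object without the flag is still cleaned.

```python
import re

# Stand-in cleaner for this sketch: it only removes Word's <o:p> tags.
OFFICE_TAG = re.compile(r"</?o:p\b[^>]*>", re.IGNORECASE)


def strip_office_tags(html):
    return OFFICE_TAG.sub("", html)


class Document(object):
    """Hypothetical content object; not an Archetypes class."""

    # Names of the attributes that hold rich text (hypothetical).
    rich_text_fields = ("body", "summary")

    def __init__(self, body="", summary="", clean_word_pasted_text=True):
        self.body = body
        self.summary = summary
        # Stand-in for the per-object checkbox in the 'settings' fieldset.
        self.clean_word_pasted_text = clean_word_pasted_text


def sanitize_rich_text(obj, event=None, cleaner=strip_office_tags):
    """Subscriber-style handler: clean each rich text field unless opted out."""
    # Assumption: an object without the flag is still cleaned.
    if not getattr(obj, "clean_word_pasted_text", True):
        return
    for name in getattr(obj, "rich_text_fields", ()):
        setattr(obj, name, cleaner(getattr(obj, name)))
```

Calling `sanitize_rich_text(doc)` after a create or edit rewrites the fields in place, and an object created with `clean_word_pasted_text=False` is left untouched.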

The HTML is cleaned and sanitized mainly with the lxml library, through the htmllaundry package written by Wichert Akkerman.
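
To give a rough idea of the kind of cleanup involved, here is a minimal whitelist-based sketch. It deliberately uses only Python's standard `html.parser` module instead of lxml or htmllaundry, so it is not how the product is implemented. The tag and attribute whitelists and the names `WordMarkupStripper` and `clean_word_html` are assumptions made up for this example.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelists, chosen for illustration only.
ALLOWED_TAGS = {"p", "a", "strong", "em", "ul", "ol", "li", "h1", "h2", "h3", "br"}
ALLOWED_ATTRS = {"a": {"href"}}
VOID_TAGS = {"br"}
SKIP_CONTENT = {"style", "script"}


class WordMarkupStripper(HTMLParser):
    """Keep whitelisted tags and attributes, keep text, drop the rest."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1  # ignore embedded CSS and scripts entirely
            return
        if tag not in ALLOWED_TAGS:
            return  # e.g. <span>, <font>, <o:p>: drop the tag, keep its text
        kept = [
            (name, value)
            for name, value in attrs
            if name in ALLOWED_ATTRS.get(tag, ()) and value is not None
        ]
        rendered = "".join(
            ' %s="%s"' % (name, escape(value, quote=True)) for name, value in kept
        )
        self.parts.append("<%s%s>" % (tag, rendered))

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(escape(data, quote=False))

    # Comments, including Word's conditional comments, are dropped because
    # handle_comment is not overridden.


def clean_word_html(html):
    parser = WordMarkupStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

For example, `clean_word_html('<p class="MsoNormal"><span lang="EN-GB">Hello</span><o:p></o:p></p>')` returns `<p>Hello</p>`. A real cleaner built on lxml also repairs broken markup, which this sketch does not attempt.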


This product does not have to be installed via quick_installer or the Plone control panel.

Just add it to your buildout or install it via easy_install.


1.2.8 (2010-12-21)

  • Use Wichert Akkerman’s htmllaundry utilities for code sanitizing. (jcbrand)

1.2.7 (2010-02-17)

  • Only create a new version if at_edit would not create one anyway (thomasw)

1.2.6 (2009-12-13)

  • Don’t force target="_blank" on links. (thomasw)

1.2.5 (2009-12-07)

  • Set add_nofollow to False, since it seems to confuse the lxml.html parser. (thomasw)

1.2.4 (2009-11-25)

  • Added LinguaPlone’s generateMethods magic, so that the languageIndependent field gets propagated to all translations when the canonical is edited (thomasw)

1.2.3 (2009-11-19)

  • don’t strip ‘h1’ and ‘h2’ (jcbrand)

1.2.2 (2009-11-16)

  • Added a more flexible encoding detection mechanism (pilz)

1.2.1 (2009-11-03)

  • I was too stupid to make a successful release, here we go again (thomasw)

1.2 (2009-11-03)

  • Bugfix in event-handler: don’t fail if cleanWordPastedText field isn’t present (thomasw)

1.1 (2009-10-27)

  • Add a new sanitize method and new helper methods. Thanks to Wichert.
  • Enable the cleaner by default. (jcbrand)

1.0 (2009-10-23)

  • Initial release (jcbrand)


Download Files

slc.cleanwordpastedtext-1.2.8.tar.gz (17.8 kB), source distribution, uploaded Dec 21, 2010
