Project Description


HTML from an outside source, whether user input or data gathered by scraping, often needs to be cleaned up before it can be used. The SoupStrainer class in collective.soupstrainer makes this easy. It uses BeautifulSoup to parse and clean up HTML. The constructor of the class takes three arguments.

  • A list of tuples with two items each. The first item is a list of tag names, the second item is a list of attributes. If the list of attributes is empty, each tag in the first list is removed completely from the HTML passed in. If the list of tags is empty, each listed attribute is removed from every tag. If both tags and attributes are listed, the attributes are removed only from the matching tags.
  • A white list of CSS styles allowed in ‘style’ attributes. All other styles are removed.
  • A black list of CSS classes. Each matching class is removed from ‘class’ attributes.
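
To make the three arguments more concrete, here is a minimal sketch of how such rules could be interpreted. It is not the code of collective.soupstrainer: the function names `split_exclusions`, `filter_style` and `filter_classes` are hypothetical, and the sketch uses only the standard library on plain strings rather than BeautifulSoup. The style filter assumes simple `name: value` declarations separated by semicolons.

```python
def split_exclusions(exclusions):
    """Sort (tags, attributes) tuples into the three cases described above.

    Hypothetical helper for illustration only.
    """
    remove_tags = set()       # tags that are removed completely
    remove_attrs = set()      # attributes removed from every tag
    remove_attrs_on = {}      # tag name -> attributes removed only from that tag
    for tags, attrs in exclusions:
        if tags and not attrs:
            remove_tags.update(tags)
        elif attrs and not tags:
            remove_attrs.update(attrs)
        else:
            for tag in tags:
                remove_attrs_on.setdefault(tag, set()).update(attrs)
    return remove_tags, remove_attrs, remove_attrs_on


def filter_style(style, whitelist):
    """Keep only the white-listed declarations of a 'style' attribute value."""
    kept = []
    for declaration in style.split(";"):
        name, sep, value = declaration.partition(":")
        if sep and name.strip().lower() in whitelist:
            kept.append(f"{name.strip()}: {value.strip()}")
    return "; ".join(kept)


def filter_classes(class_value, blacklist):
    """Drop every black-listed class from a 'class' attribute value."""
    return " ".join(c for c in class_value.split() if c not in blacklist)


# Example rules: drop <script> and <style> entirely, drop 'onclick'
# everywhere, and drop 'target' only from <a> tags.
example_exclusions = [
    (["script", "style"], []),
    ([], ["onclick"]),
    (["a"], ["target"]),
]
```

A tuple with both lists filled in only affects the named tags, so `target` in the example would survive on any tag other than `a`.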

An instance of the SoupStrainer class can be called directly with one argument. If the argument is a string, it is parsed internally by BeautifulSoup and the result is returned as unicode. If it is an HTML tree already parsed by BeautifulSoup, the tree is modified in place and returned.
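
The calling convention described above (string in, string out; tree in, same tree out) can be sketched as follows. This is an illustration under stated assumptions, not the SoupStrainer implementation: the class `AttributeStripper` is hypothetical, and the standard library's `xml.etree.ElementTree` stands in for BeautifulSoup, so it only accepts well-formed markup.

```python
import xml.etree.ElementTree as ET


class AttributeStripper:
    """Hypothetical stand-in that only demonstrates the calling convention."""

    def __init__(self, attributes):
        self.attributes = set(attributes)

    def __call__(self, markup):
        if isinstance(markup, str):
            # String input: parse, clean, and serialize back to text.
            tree = ET.fromstring(markup)
            self._clean(tree)
            return ET.tostring(tree, encoding="unicode")
        # Tree input: modify in place and return the same object.
        self._clean(markup)
        return markup

    def _clean(self, tree):
        for element in tree.iter():
            for name in self.attributes & set(element.attrib):
                del element.attrib[name]
```

Returning the same tree object lets a caller that already holds a parsed document keep working with it without parsing it a second time.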


1.0 - 2008-11-14

  • Initial release

Download Files


Source distribution (14.8 kB), uploaded Nov 14, 2008
