Skip to main content
This is a pre-production deployment of Warehouse. Changes made here affect the production instance of PyPI (
Help us improve Python packaging - Donate today!

Web crawler and HTML layout analyzer

Project Description

Webstemmer is a web crawler and HTML layout analyzer that automatically extracts main text of a news site without having banners, ads and/or navigation links mixed up.

Release History

This version
History Node


History Node


History Node


History Node


History Node


History Node


Supported By

Elastic Elastic Search Pingdom Pingdom Monitoring Dyn Dyn DNS Sentry Sentry Error Logging CloudAMQP CloudAMQP RabbitMQ Kabu Creative Kabu Creative UX & Design Google Google Cloud Servers Fastly Fastly CDN StatusPage StatusPage Statuspage DigiCert DigiCert EV Certificate