
Patu is a small spider

Project Description


A small spider, useful for checking a site for 404s and 500s. Patu requires httplib2 and lxml:

pip install -U httplib2 lxml

Quick Usage

To see available options: --help

To spider an entire site using 5 workers, only showing errors: --spiders=5

To spider, stopping after the first level of links: --depth=1

To get a list of every linked page on a site: --generate > urls.txt

To check URLs from a file rather than spidering for them, and show all responses: --input=urls.txt --verbose
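
As a rough illustration of what a depth limit such as --depth=1 could mean, here is a minimal sketch of a breadth-first walk that stops following links once a page sits at the limit. It is not Patu's actual code. The names crawl, get_links, SITE and site_links are hypothetical, the in-memory SITE dictionary stands in for real HTTP fetches, and the sketch assumes the start page is depth 0 and the pages it links to directly are depth 1.

```python
from collections import deque


def crawl(start, get_links, max_depth=None):
    """Breadth-first walk from `start`, returning {url: depth}.

    `get_links` is a hypothetical callable mapping a URL to the URLs it
    links to. `max_depth=None` means no limit.
    """
    seen = {start: 0}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        depth = seen[url]
        if max_depth is not None and depth >= max_depth:
            continue  # do not follow links found on pages at the depth limit
        for link in get_links(url):
            if link not in seen:
                seen[link] = depth + 1
                queue.append(link)
    return seen


# Hypothetical in-memory site standing in for real HTTP fetches.
SITE = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": [],
}


def site_links(url):
    """Return the links found on `url` in the hypothetical SITE."""
    return SITE.get(url, [])
```

With max_depth=1 the walk visits the start page and the pages it links to, but never reaches /blog/post-1. With no limit it visits all four pages.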

Format of URLs File

The output produced by <code>--generate</code> is formatted like so:


<code>--input</code> can take a file of that format, or one URL per line with no referer. <code>--input=-</code> reads from stdin.
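
The simpler of the two input forms, one URL per line with "-" meaning stdin, can be sketched as follows. This is an illustration under stated assumptions, not Patu's own parser: read_urls and _clean are hypothetical names, and the sketch skips blank lines and does not handle the referer-carrying format, which this page does not show.

```python
import sys


def _clean(lines):
    """Strip surrounding whitespace and drop blank lines."""
    return [line.strip() for line in lines if line.strip()]


def read_urls(path):
    """Return the URLs listed in `path`, or read from stdin when path is "-"."""
    if path == "-":
        return _clean(sys.stdin)
    with open(path, encoding="utf-8") as handle:
        return _clean(handle)
```

Treating "-" as stdin lets a list of URLs be piped in from another command instead of being written to a file first.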


Patu uses Nose for testing. To install Nose and test:

pip install -U nose
nosetests


Download Files


File type: Source. Python version: None. Size: 9.1 kB. Upload date: May 19, 2010.
