Palu is a small spider, a fork of Patu.

Project Description

Palu

A small spider, useful for checking a site for 404s and 500s. It’s a fork of [Patu][1].

Palu requires httplib2 and lxml:

pip install -U httplib2 lxml

Is it safe? [![Build Status](https://secure.travis-ci.org/akrito/palu.png?branch=master)](http://travis-ci.org/akrito/palu)

Quick Usage

To see available options:

palu.py --help

To spider an entire site using 5 workers, only showing errors:

palu.py --spiders=5 www.example.com

To spider, stopping after the first level of links:

palu.py --depth=1 www.example.com

To get a list of every linked page on a site:

palu.py --generate www.example.com > urls.txt

To read URLs from a file instead of spidering for them, and show all responses:

palu.py --input=urls.txt --verbose www.example.com

Format of URLs File

The output produced by <code>--generate</code> is formatted like so:

FIRST_URL<TAB>None
LINK1<TAB>REFERER
LINK2<TAB>REFERER

<code>--input</code> accepts a file in that format, or one URL per line with no referer. <code>--input=-</code> reads from stdin.
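For anyone producing or consuming such a file from another script, here is a minimal Python 3 sketch of reading the format described above. It is not Palu's own parser: the function names `parse_url_line` and `parse_urls` are hypothetical, and treating the literal text `None` as "no referer" is an assumption based on the first line of the format.

```python
def parse_url_line(line):
    """Split one line of a URLs file into a (url, referer) pair.

    Returns None for blank lines. The referer is None when the line has
    no tab-separated second field, or when that field is the literal
    text "None" (an assumption based on the documented format).
    """
    line = line.rstrip("\r\n")
    if not line.strip():
        return None
    url, sep, referer = line.partition("\t")
    referer = referer.strip()
    if not sep or referer in ("", "None"):
        return url.strip(), None
    return url.strip(), referer


def parse_urls(lines):
    """Parse an iterable of lines, skipping blank ones."""
    pairs = (parse_url_line(line) for line in lines)
    return [pair for pair in pairs if pair is not None]


if __name__ == "__main__":
    sample = [
        "http://www.example.com/\tNone\n",
        "http://www.example.com/about\thttp://www.example.com/\n",
        "http://www.example.com/contact\n",
    ]
    for url, referer in parse_urls(sample):
        print(url, referer)
```

Because both accepted layouts reduce to the same pair, a file with referers and a plain list of URLs can be handled by one code path.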

Testing

Palu uses Nose for testing. To install Nose and test:

pip install -U nose
nosetests

[1]:https://pypi.python.org/pypi/patu

Release History

0.1

Download Files

| File Name | File Type | Upload Date |
| --- | --- | --- |
| palu-0.1.tar.gz (9.4 kB) | Source | May 9, 2013 |
