Palu is a small spider, a fork of Patu.
Palu
A small spider, useful for checking a site for 404s and 500s. It’s a fork of [Patu][1].
Palu requires httplib2 and lxml:
pip install -U httplib2 lxml
Is it safe? [Build status](http://travis-ci.org/akrito/palu)
Quick Usage
To see available options:
palu.py --help
To spider an entire site using 5 workers, only showing errors:
palu.py --spiders=5 www.example.com
To spider, stopping after the first level of links:
palu.py --depth=1 www.example.com
To get a list of every linked page on a site:
palu.py --generate www.example.com > urls.txt
To check URLs from a file instead of spidering for them, and show all responses:
palu.py --input=urls.txt --verbose www.example.com
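As an illustration of what a depth limit such as `--depth=1` means, here is a minimal sketch of a depth-limited breadth-first walk. It is not Palu's implementation: `crawl`, `links_of`, and the in-memory `SITE` mapping are hypothetical names, and the toy mapping stands in for real HTTP fetches and HTML parsing.

```python
from collections import deque


def crawl(start, get_links, max_depth=None):
    """Walk breadth-first from start, following links at most max_depth levels.

    get_links is a hypothetical callable that returns the URLs linked from a
    page. Returns a list of (url, referer) pairs in visit order.
    """
    seen = {start}
    visited = []
    queue = deque([(start, None, 0)])
    while queue:
        url, referer, depth = queue.popleft()
        visited.append((url, referer))
        if max_depth is not None and depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, url, depth + 1))
    return visited


# Toy site used only for illustration: page -> pages it links to.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/c"],
    "/b": [],
    "/c": [],
}


def links_of(url):
    return SITE.get(url, [])


# With max_depth=1 the start page and its direct links are visited,
# but "/c" (two levels down) is not.
print(crawl("/", links_of, max_depth=1))
```

Each visited page is paired with the page that linked to it, which is the same URL-plus-referer shape a spider needs when reporting where a broken link was found.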
Format of URLs File
The output produced by <code>--generate</code> is formatted like so:
FIRST_URL<TAB>None
LINK1<TAB>REFERER
LINK2<TAB>REFERER
<code>--input</code> can take a file of that format, or one URL per line with no referer. <code>--input=-</code> reads from stdin.
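To make the two accepted layouts concrete, the following sketch reads either tab-separated `URL<TAB>REFERER` lines or bare URLs. It only illustrates the file format described above and is not Palu's own parser; `read_url_file` is a hypothetical helper, and treating the literal text `None` as "no referer" is an assumption based on the sample output.

```python
import io


def read_url_file(lines):
    """Yield (url, referer) pairs from an iterable of text lines.

    Accepts "URL<TAB>REFERER" lines as well as bare "URL" lines.
    A missing referer, or the literal text "None", becomes None.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        url, _sep, referer = line.partition("\t")
        referer = referer.strip()
        if not referer or referer == "None":
            referer = None
        yield url.strip(), referer


# Sample input mixing both layouts; the URLs are placeholders.
sample = io.StringIO(
    "http://www.example.com/\tNone\n"
    "http://www.example.com/about\thttp://www.example.com/\n"
    "http://www.example.com/contact\n"
)
pairs = list(read_url_file(sample))
for url, referer in pairs:
    print(url, referer)
```

Because the function takes any iterable of lines, the same code could be handed `sys.stdin` to mirror the `--input=-` case.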
Testing
Palu uses Nose for testing. To install Nose and test:
pip install -U nose
nosetests
Source Distribution
File details
Details for the file palu-0.1.tar.gz.
File metadata
- Download URL: palu-0.1.tar.gz
- Upload date:
- Size: 9.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1173921168bf427495e2d431ec50522bf80240d42c6872c3213d08abb7defa74
MD5 | 27e0d848f3f1fa580f1c1158236820e1
BLAKE2b-256 | 42e0dee0cd6f7486c0adef86a8db0558b6aa70943b06e6487d46376a62737bba
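A downloaded copy of palu-0.1.tar.gz can be checked against the SHA256 digest listed above. The sketch below uses only the standard-library `hashlib` module; the helper names `sha256_of_stream` and `verify_file` are hypothetical, and the local file path is an assumption about where the archive was saved.

```python
import hashlib
import io

# SHA256 digest published for palu-0.1.tar.gz (copied from the table above).
EXPECTED_SHA256 = "1173921168bf427495e2d431ec50522bf80240d42c6872c3213d08abb7defa74"


def sha256_of_stream(stream, chunk_size=65536):
    """Return the hex SHA256 digest of a binary file-like object."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected=EXPECTED_SHA256):
    """Return True if the file at path matches the expected digest."""
    with open(path, "rb") as fh:
        return sha256_of_stream(fh) == expected.lower()


# Demonstration on in-memory bytes, so the example runs without the archive.
print(sha256_of_stream(io.BytesIO(b"abc")))

# With the archive saved locally (path is an assumption):
# print(verify_file("palu-0.1.tar.gz"))
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for a 9.4 kB archive but makes the helper reusable for larger downloads.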