A terribly coded web spider, but useful for recursive API downloads.
# spidey.py
> Web spiders are usually disliked by websites, but useful for recursive API/page downloads for offline analysis.
## Installation
> PyPI location: https://pypi.python.org/pypi/spidey
- Using PyPI: `pip install spidey`
## Usage
> Run `spidey` with no arguments for detailed help.
- `spidey --dir NEW_DIR --filter DOMAIN --url URL [--base BASE_URL]`
- `spidey --dir NEW_DIR --filter DOMAIN --url URL --max MAX_DOWNLOADS`
- Example - `spidey --dir test --filter 'www.google.com' --url 'https://www.google.com/' --max 20`
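When the crawl is launched from a script rather than typed by hand, the command line above can be assembled programmatically. The following is a minimal sketch, not part of spidey itself: `build_spidey_command` is a hypothetical helper, and it uses only the long flags shown in this section (`--dir`, `--filter`, `--url`, `--base`, `--max`).

```python
import shlex


def build_spidey_command(directory, domain, url, max_downloads=None, base=None):
    """Build an argv list for the spidey CLI (hypothetical helper).

    Only flags documented in the usage lines above are emitted.
    """
    cmd = ["spidey", "--dir", directory, "--filter", domain, "--url", url]
    if base is not None:
        cmd += ["--base", base]
    if max_downloads is not None:
        cmd += ["--max", str(max_downloads)]
    return cmd


cmd = build_spidey_command(
    "test", "www.google.com", "https://www.google.com/", max_downloads=20
)
# Print a shell-safe rendering of the command for inspection.
print(shlex.join(cmd))
```

Assuming `spidey` is installed and on the `PATH`, the list can then be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string avoids shell-quoting problems with URLs.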
### More Examples
```
spidey \
-d test \
-f 'www.google.com' \
-u 'https://www.google.com/' \
-b 'https://www.google.com/' \
-hh '{"Accept" : "application/json"}' \
-n 2 \
-m 10 \
-s 5
```
```
spidey \
--dir test \
--filter 'www.google.com' \
--url 'https://www.google.com/' \
--base 'https://www.google.com/' \
--headers '{"Accept" : "application/json"}' \
--depth 2 \
--max 10 \
--sleep 5
```
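To make the options in these examples more concrete, here is a rough sketch of what a domain-filtered, depth-limited recursive download generally looks like. It is an illustration only and is not spidey's actual implementation: the `crawl` function, the `LinkParser` class, and the exact meaning given to `max_depth`, `max_pages`, and `delay` are assumptions chosen to mirror `--filter`, `--depth`, `--max`, and `--sleep`. The `fetch` argument is supplied by the caller so that the sketch stays independent of any HTTP library.

```python
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, domain, fetch, max_depth=2, max_pages=10, delay=0.0):
    """Breadth-first crawl restricted to `domain` (illustrative sketch).

    `fetch` maps a URL to its HTML text. Returns {url: html} in fetch order.
    """
    pages = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue and len(pages) < max_pages:
        url, depth = queue.popleft()
        pages[url] = fetch(url)
        if delay:
            time.sleep(delay)  # be polite between requests
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        parser = LinkParser()
        parser.feed(pages[url])
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages
```

For a real crawl, `fetch` could be something like `lambda u: urllib.request.urlopen(u).read().decode()`; in tests, a dictionary lookup over canned pages works just as well.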
Source distribution: `spidey.py-0.4.3.tar.gz` (4.0 kB)

Built distribution: `spidey.py-0.4.3-py2-none-any.whl`

Hashes for `spidey.py-0.4.3-py2-none-any.whl`:

Algorithm | Hash digest
---|---
SHA256 | ab7e48f0c8bb70c857fb84a66816114de19c269c0c5c192e9ab65b1f014ebee0
MD5 | 1953bac1e5f7052cd9f5d7e029cd8e9c
BLAKE2b-256 | 459a4f98ad9ade58cbd188b5ae60f21e62223f34aed3cdd5421320253756de0e