A web spider for collecting specific data across a set of configured sites
Project description
Parker is a Python-based web spider for collecting specific data across a set of configured sites.
Non-Python requirements:
Redis - for task queuing and visit tracking
libxml - for HTML parsing of pages
Installation
Install using pip:
$ pip install parker
Configuration
To configure Parker, you will need to install the configuration files in a suitable location for the user running Parker. To do this, use the parker-config script. For example:
$ parker-config ~/.parker
This installs the configuration in ~/.parker under your home directory and prints the related environment variable for you to set in your .bashrc.
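As a rough illustration of how a script might locate that directory, the sketch below reads the PARKER_CONFIG environment variable (named in the 0.4.0 changes) and checks that it points at an existing directory. The helper name `resolve_config_dir` and the `~/.parker` fallback are assumptions made for this example; they are not part of Parker's API, and Parker's own lookup logic may differ.

```python
import os
from pathlib import Path


def resolve_config_dir(env=None, default="~/.parker"):
    """Return the directory named by PARKER_CONFIG, or a fallback path.

    Hypothetical helper for illustration only; not part of Parker.
    The fallback of ~/.parker is an assumption taken from the example above.
    """
    env = os.environ if env is None else env
    raw = env.get("PARKER_CONFIG") or default
    # Expand a leading "~" so the result is usable as a real path.
    return Path(raw).expanduser()


if __name__ == "__main__":
    config_dir = resolve_config_dir()
    if config_dir.is_dir():
        print(f"Configuration directory found: {config_dir}")
    else:
        print(f"No configuration at {config_dir}; run parker-config first")
```

Running this after `parker-config ~/.parker` and exporting the printed variable should report the directory as found.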
Changes
0.5.0
Update ConsumeModel to post-process the data. This enables us to populate specific data from a reference to a key-value field.
Reorder changes so the newest come first, and rename the section to “Changes” in the long description.
0.4.2
Fix the RST headers, which may be the cause of the problem.
Remove the decode/encode step, which was not the cause of the issue.
0.4.1
Try ASCII-only RST to see whether it fixes the issues on PyPI.
0.4.0
Added handling for a PARKER_CONFIG environment variable, allowing users to specify where configuration files are loaded from.
Added the parker-config script to install default configuration files to a passed location. Also prints out an example PARKER_CONFIG environment variable to add to your profile files.
Updated documentation to use proper reStructuredText files.
Add a CHANGES file to track updates.
Source Distribution
Built Distribution
File details
Details for the file Parker-0.5.0.tar.gz.
File metadata
- Download URL: Parker-0.5.0.tar.gz
- Upload date:
- Size: 138.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | cd017478a0cb0e6a328eb38f4033cf8d590dfcc27df76a96e7187de28504ccc8
MD5 | f099c5dfa43e94304ec179dfb55181f4
BLAKE2b-256 | 0653f8b48ac84d102295d2749c5ffa988511762e659adeb7fb6895cc4dac686f
File details
Details for the file Parker-0.5.0-py2.py3-none-any.whl.
File metadata
- Download URL: Parker-0.5.0-py2.py3-none-any.whl
- Upload date:
- Size: 17.3 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2899079ee297ec4de83c15359a6e20bef4339a19f3f6fb1a99de369d0f77aed7
MD5 | 0979f3e0a3d9093cb191fc5ee26fe5a3
BLAKE2b-256 | 4c645de86622d83d8a9a050683d0f5d5f9121d75ae1dbb6a47bc46ea7f324182
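The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using only Python's standard `hashlib` module; the function names `sha256_of` and `matches_published_digest` are illustrative and not part of Parker or pip.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest("Parker-0.5.0.tar.gz", "cd017478a0cb0e6a328eb38f4033cf8d590dfcc27df76a96e7187de28504ccc8")` should return True for an intact copy of the source distribution, assuming the file sits in the current directory.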