Parker is a Python-based web spider for collecting specific data across a set of configured sites.
Parker depends on the following:
- Redis - for task queuing and visit tracking
- libxml - for HTML parsing of pages
Install using pip:
$ pip install parker
To configure Parker, you will need to install the configuration files in a suitable location for the user running Parker. To do this, use the parker-config script. For example:
$ parker-config ~/.parker
This installs the configuration files in ~/.parker in your home directory and prints the corresponding PARKER_CONFIG environment variable for you to set in your .bashrc.
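As an illustration of how a script could locate those files once the variable is set, here is a minimal sketch. It is not Parker's own code: the helper name `resolve_config_dir` and the `~/.parker` fallback are assumptions made for this example, and only the PARKER_CONFIG variable name comes from the project itself.

```python
import os


def resolve_config_dir(environ=None, default="~/.parker"):
    """Return the directory that configuration files would be read from.

    Hypothetical helper for illustration only; not part of Parker's API.
    The "~/.parker" default is an assumption, not documented behaviour.
    """
    env = os.environ if environ is None else environ
    # Prefer an explicit PARKER_CONFIG value; otherwise use the assumed default.
    raw = env.get("PARKER_CONFIG") or default
    # Expand "~" and normalise to an absolute path.
    return os.path.abspath(os.path.expanduser(raw))


if __name__ == "__main__":
    print(resolve_config_dir())
```

Passing the environment in as a mapping keeps the lookup easy to test without modifying the real process environment.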
- Fix an issue with the order of key-value reference resolution that prevented the effective use of unique_field if using a field that was a kv_ref.
- Add some Parker-specific configuration so we can specify where to download, in case the PROJECT environment variable doesn’t exist.
- Update ConsumeModel to post process the data. This enables us to populate specific data from a reference to a key-value field.
- Reorder changes so newest first, and rename to “Changes” in the long description.
- Bug fix to fix RST headers which may be the problem.
- Remove the decode/encode which is not the issue.
- Bug fix to see if RST in ASCII fixes issues on PyPI.
- Added handling for a PARKER_CONFIG environment variable, allowing users to specify where configuration files are loaded from.
- Added the parker-config script to install default configuration files to a passed location. Also prints out an example PARKER_CONFIG environment variable to add to your profile files.
- Updated documentation to use proper reStructuredText files.
- Add a CHANGES file to track updates.
|Filename|Size|File type|Python version|Upload date|
|Parker-0.5.1-py2.py3-none-any.whl|17.9 kB|Wheel|2.7|Jul 20, 2014|
|Parker-0.5.1.tar.gz|138.7 kB|Source|None|Jul 20, 2014|