A web spider for collecting specific data across a set of configured sites

Parker is a Python-based web spider for collecting specific data across a set of configured sites.

Non-Python requirements:

  • Redis - for task queuing and visit tracking

  • libxml - for HTML parsing of pages
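
As a rough pre-flight check for the two requirements above, the sketch below uses only the Python standard library. It assumes Redis listens on its default port 6379 on localhost and that libxml is reached through the lxml Python bindings. Both are assumptions rather than anything Parker documents, and the function names are hypothetical.

```python
import importlib.util
import socket


def redis_reachable(host="localhost", port=6379, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds.

    This only checks that something is listening on the port; it does
    not confirm that the listener is actually Redis.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def lxml_available():
    """Return True if the lxml bindings (an assumption here) are importable."""
    return importlib.util.find_spec("lxml") is not None


if __name__ == "__main__":
    print("Redis reachable:", redis_reachable())
    print("lxml importable:", lxml_available())
```

A failed check here only suggests where to look first; it is not a substitute for Parker's own error output.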

Installation

Install using pip:

$ pip install parker

Configuration

To configure Parker, install its configuration files in a location accessible to the user running Parker. The parker-config script does this for you. For example:

$ parker-config ~/.parker

This installs the configuration files in ~/.parker under your home directory and prints the matching PARKER_CONFIG environment variable for you to set in your .bashrc.
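
To illustrate how a PARKER_CONFIG value might be turned into a configuration directory, here is a minimal sketch. It is not Parker's actual loading code: the function name resolve_config_dir and the fallback to ~/.parker when the variable is unset are assumptions made for this example.

```python
import os
from pathlib import Path

# Hypothetical fallback used only in this sketch; the source does not
# say what Parker does when PARKER_CONFIG is unset.
DEFAULT_CONFIG_DIR = "~/.parker"


def resolve_config_dir(environ=None):
    """Return the configuration directory as a Path.

    Reads PARKER_CONFIG from the given mapping (os.environ by default)
    and expands a leading "~" to the user's home directory.
    """
    if environ is None:
        environ = os.environ
    raw = environ.get("PARKER_CONFIG") or DEFAULT_CONFIG_DIR
    return Path(raw).expanduser()


if __name__ == "__main__":
    print(resolve_config_dir())
```

Passing the environment as a mapping keeps the function easy to exercise without changing the real process environment.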

Changes

0.5.0

  • Update ConsumeModel to post-process the data. This enables us to populate specific data from a reference to a key-value field.

  • Reorder changes so the newest come first, and rename the section to “Changes” in the long description.

0.4.2

  • Fix RST headers, which may be the cause of the issues on PyPI.

  • Remove the decode/encode step, which was not the cause.

0.4.1

  • Bug fix to see if RST in ASCII fixes issues on PyPI.

0.4.0

  • Added handling for a PARKER_CONFIG environment variable, allowing users to specify where configuration files are loaded from.

  • Added the parker-config script to install default configuration files to a passed location. Also prints out an example PARKER_CONFIG environment variable to add to your profile files.

  • Updated documentation to use proper reStructuredText files.

  • Add a CHANGES file to track updates.
