A web spider for collecting specific data across a set of configured sites

Project description

Parker is a Python-based web spider for collecting specific data across a set of configured sites.

Non-Python requirements:

  • Redis - for task queuing and visit tracking

  • libxml - for HTML parsing of pages

Installation

Install using pip:

$ pip install parker

Configuration

To configure Parker, you will need to install the configuration files in a suitable location for the user running Parker. To do this, use the parker-config script. For example:

$ parker-config ~/.parker

This installs the configuration files in ~/.parker and prints the PARKER_CONFIG environment variable setting for you to add to your .bashrc.
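As a rough illustration of how a PARKER_CONFIG-style lookup can work, the sketch below resolves a configuration directory from the environment and falls back to a default. It is not Parker's actual implementation: the function name resolve_config_dir and the fallback path DEFAULT_CONFIG_DIR are hypothetical and chosen only for this example.

```python
import os

# Hypothetical fallback location, used only for this sketch.
# Parker's real default (if any) may differ.
DEFAULT_CONFIG_DIR = os.path.join(os.path.expanduser("~"), ".parker")


def resolve_config_dir(environ=None):
    """Return the directory to load configuration files from.

    Prefers PARKER_CONFIG when it is set and non-empty; otherwise
    falls back to DEFAULT_CONFIG_DIR.
    """
    if environ is None:
        environ = os.environ
    configured = environ.get("PARKER_CONFIG", "").strip()
    if configured:
        # Expand "~" and normalise to an absolute path.
        return os.path.abspath(os.path.expanduser(configured))
    return DEFAULT_CONFIG_DIR


if __name__ == "__main__":
    print(resolve_config_dir())
```

Passing the environment as an argument keeps the lookup easy to exercise without modifying the real process environment.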

Changes

0.5.1

  • Fix the order of key-value reference resolution, which prevented unique_field from working when the chosen field was a kv_ref.

  • Add Parker-specific configuration for specifying the download location, in case the PROJECT environment variable doesn’t exist.

0.5.0

  • Update ConsumeModel to post-process the data. This enables us to populate specific data from a reference to a key-value field.

  • Reorder changes so newest first, and rename to “Changes” in the long description.

0.4.2

  • Fix the RST headers, which may be the cause of the problem.

  • Remove the decode/encode, which was not the cause.

0.4.1

  • Bug fix: check whether ASCII-only RST fixes the issues on PyPI.

0.4.0

  • Added handling for a PARKER_CONFIG environment variable, allowing users to specify where configuration files are loaded from.

  • Added the parker-config script to install default configuration files to a passed location. Also prints out an example PARKER_CONFIG environment variable to add to your profile files.

  • Updated documentation to use proper reStructuredText files.

  • Add a CHANGES file to track updates.

Download files

Source Distribution

Parker-0.5.1.tar.gz (138.7 kB)

Uploaded Source

Built Distribution

Parker-0.5.1-py2.py3-none-any.whl (17.9 kB)

Uploaded Python 2, Python 3

File details

Details for the file Parker-0.5.1.tar.gz.

File metadata

  • Download URL: Parker-0.5.1.tar.gz
  • Upload date:
  • Size: 138.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for Parker-0.5.1.tar.gz

Algorithm    Hash digest
SHA256       4f984a24a745472ed76af19a0f64b4517badc3f321e4d9705f0d198d828e2001
MD5          656afaaf4d9db1cd84b9d610842b7292
BLAKE2b-256  348755b244c09c8ee4aabb273040bebdca5eaa77f7d542eeb948a3adfde02cbc

File details

Details for the file Parker-0.5.1-py2.py3-none-any.whl.

File hashes

Hashes for Parker-0.5.1-py2.py3-none-any.whl

Algorithm    Hash digest
SHA256       9905b4c593ce6b18ec8f12798fad8b2758aa174cdccfa452b4aa38bbc947d9b4
MD5          2f2b3ea04af442d4dc25f224ff7264a3
BLAKE2b-256  b9906d406b9a8ccd0013c78516f43da784a442b7d64fd6e62894c5bd6cd374dc
