A web spider for collecting specific data across a set of configured sites

Project description

Parker is a Python-based web spider for collecting specific data across a set of configured sites.

Non-Python requirements:

  • Redis - for task queuing and visit tracking

  • libxml - for HTML parsing of pages
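Before running a crawl it can help to confirm that a Redis server is actually reachable. The following is a minimal sketch, not part of Parker: the helper name redis_reachable is hypothetical, and it assumes Redis is listening on its default port 6379 on localhost. It only checks that a TCP connection can be opened, not that the server speaks the Redis protocol.

```python
import socket


def redis_reachable(host="localhost", port=6379, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; 6379 is the Redis default port.
    """
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # socket.timeout) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if redis_reachable():
        print("Redis port is accepting connections")
    else:
        print("No server found on localhost:6379")
```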

Installation

Install using pip:

$ pip install parker

Configuration

To configure Parker, you will need to install the configuration files in a suitable location for the user running Parker. To do this, use the parker-config script. For example:

$ parker-config ~/.parker

This installs the configuration files in ~/.parker in your home directory and prints the corresponding PARKER_CONFIG environment variable for you to set in your .bashrc.
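To illustrate how an environment variable such as PARKER_CONFIG can select a configuration location, here is a minimal sketch. It is not Parker's own loading code: the function resolve_config_dir and the ~/.parker fallback are assumptions made for the example, and the variable is assumed to hold a directory path.

```python
import os
from pathlib import Path


def resolve_config_dir(env=None, default="~/.parker"):
    """Pick a configuration directory from PARKER_CONFIG, else a default.

    Hypothetical helper: Parker's real lookup logic may differ.
    """
    env = os.environ if env is None else env
    # An unset or empty variable falls back to the assumed default.
    raw = env.get("PARKER_CONFIG") or default
    # expanduser turns a leading "~" into the user's home directory.
    return Path(raw).expanduser()


if __name__ == "__main__":
    print(resolve_config_dir())
```

Passing the environment as an argument keeps the lookup easy to exercise without touching the real process environment.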

News

0.4.0

  • Added handling for a PARKER_CONFIG environment variable, allowing users to specify where configuration files are loaded from.

  • Added the parker-config script to install default configuration files to a passed location. Also prints out an example PARKER_CONFIG environment variable to add to your profile files.

  • Updated documentation to use proper reStructuredText files.

  • Added a CHANGES file to track updates.

0.4.1

  • Bug fix: switched the RST to ASCII to see whether that fixes issues on PyPI.

0.4.2

  • Bug fix: corrected the RST headers, which may be the cause of the problem.

  • Removed the decode/encode step, which was not the cause.
