Scrapy entrypoint for Estela job runner

estela Entrypoint

The package implements a wrapper layer that extracts job data from the environment, prepares the job, and executes it using Scrapy.

Entrypoints

  • estela-crawl: Process job args and settings to run the job with Scrapy.
  • estela-describe-project: Print JSON-encoded project information and image metadata.
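
As a rough illustration of what "process job args and settings to run the job with Scrapy" can mean, the sketch below turns a spider name, job arguments, and settings into a `scrapy crawl` command line using Scrapy's standard `-a NAME=VALUE` and `-s NAME=VALUE` options. The function name `build_crawl_argv` is hypothetical, and this is not necessarily how estela-crawl is implemented internally.

```python
def build_crawl_argv(spider, args=None, settings=None):
    """Build an argv list for `scrapy crawl` (illustrative helper, not part of the package)."""
    argv = ["scrapy", "crawl", spider]
    # Spider arguments are passed with -a NAME=VALUE.
    for name, value in (args or {}).items():
        argv += ["-a", f"{name}={value}"]
    # Settings overrides are passed with -s NAME=VALUE.
    for name, value in (settings or {}).items():
        argv += ["-s", f"{name}={value}"]
    return argv


if __name__ == "__main__":
    # Hypothetical spider name, argument, and setting.
    print(build_crawl_argv("quotes", {"tag": "humor"}, {"LOG_LEVEL": "INFO"}))
```

A list like this could be handed to `subprocess.run` or to Scrapy's own command-line machinery; which of those the package uses is not stated here.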

Installation

$ python setup.py install 

Requirements

$ pip install -r requirements.txt

Environment variables

Job specifications are passed through environment variables:

  • JOB_INFO: Dictionary with these fields:
    • [Required] key: Job key (job ID, spider ID and project ID).
    • [Required] spider: String spider name.
    • [Required] auth_token: User authentication token.
    • [Required] api_host: API host URL.
    • [Optional] args: Dictionary with job arguments.
    • [Required] collection: String with name of collection where items will be stored.
    • [Optional] unique: String, "True" if the data will be stored in a unique collection, "False" otherwise. Required only for cronjobs.
  • QUEUE_PLATFORM: The queue platform used by estela. See the list of currently supported platforms.
  • QUEUE_PLATFORM_{PARAMETERS}: Please, refer to the estela-queue-adapter documentation to declare the needed variables.
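
The field list above can be checked with a small loader such as the sketch below. It assumes that JOB_INFO holds a JSON-encoded dictionary, which the text above does not state explicitly; the helper name `load_job_info` and all sample values (including the shape of `key`) are hypothetical.

```python
import json
import os

# Required field names, taken from the list above.
REQUIRED_FIELDS = ("key", "spider", "auth_token", "api_host", "collection")


def load_job_info(environ=os.environ):
    """Parse JOB_INFO (assumed to be JSON) and verify the required fields."""
    raw = environ.get("JOB_INFO")
    if raw is None:
        raise ValueError("JOB_INFO is not set")
    info = json.loads(raw)
    if not isinstance(info, dict):
        raise ValueError("JOB_INFO must decode to a dictionary")
    missing = [name for name in REQUIRED_FIELDS if name not in info]
    if missing:
        raise ValueError("JOB_INFO is missing: " + ", ".join(missing))
    # args is optional, so default it to an empty dictionary.
    info.setdefault("args", {})
    return info


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    sample_env = {
        "JOB_INFO": json.dumps(
            {
                "key": "1.2.3",
                "spider": "quotes",
                "auth_token": "example-token",
                "api_host": "http://localhost:8000",
                "collection": "1-2-job_items",
            }
        )
    }
    print(load_job_info(sample_env)["spider"])
```

Passing the environment as a parameter keeps the check easy to exercise in tests without touching the real process environment.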

Testing

$ pytest

Formatting

$ black .
