Scrapy entrypoint for Estela job runner

estela Entrypoint

The package implements a wrapper layer that extracts job data from the environment, prepares the job, and executes it using Scrapy.

Entrypoints

  • estela-crawl: Process job arguments and settings to run the job with Scrapy.
  • estela-describe-project: Print JSON-encoded project information and image metadata.

Installation

$ python setup.py install

Requirements

$ pip install -r requirements.txt

Environment variables

Job specifications are passed through environment variables:

  • JOB_INFO: Dictionary with these fields:
    • [Required] key: Job key (job ID, spider ID and project ID).
    • [Required] spider: String spider name.
    • [Required] auth_token: User authentication token.
    • [Required] api_host: API host URL.
    • [Optional] args: Dictionary with job arguments.
    • [Required] collection: String with the name of the collection where items will be stored.
    • [Optional] unique: String, "True" if the data will be stored in a unique collection, "False" otherwise. Required only for cronjobs.
  • QUEUE_PLATFORM: The queue platform used by estela. See the list of currently supported platforms.
  • QUEUE_PLATFORM_{PARAMETERS}: Refer to the estela-queue-adapter documentation to declare the variables each platform needs.
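
As an illustration of the JOB_INFO fields listed above, the following is a minimal sketch of how a caller might read and check the variable before starting a job. It assumes JOB_INFO holds a JSON-encoded dictionary; the helper name load_job_info and the sample values are hypothetical and are not part of the package's API.

```python
import json
import os

# Field names taken from the list above.
REQUIRED_FIELDS = ("key", "spider", "auth_token", "api_host", "collection")


def load_job_info(environ=None):
    """Parse JOB_INFO and verify that the required fields are present.

    Assumption: JOB_INFO is a JSON-encoded dictionary.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("JOB_INFO")
    if raw is None:
        raise KeyError("JOB_INFO is not set")

    job_info = json.loads(raw)
    missing = [field for field in REQUIRED_FIELDS if field not in job_info]
    if missing:
        raise ValueError(f"JOB_INFO is missing required fields: {missing}")

    # "args" is optional; default to an empty dictionary.
    job_info.setdefault("args", {})
    return job_info


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    sample = {
        "key": "<job-key>",  # placeholder; the exact key format is not shown here
        "spider": "example_spider",
        "auth_token": "<token>",
        "api_host": "https://api.example.com",
        "collection": "example_collection",
    }
    print(load_job_info({"JOB_INFO": json.dumps(sample)}))
```

Passing a plain dictionary instead of os.environ keeps the check easy to exercise in tests without touching the real environment.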

Testing

$ pytest

Formatting

$ black .
