Page Object pattern for Scrapy

Project description

scrapy-poet is the web-poet Page Object pattern implementation for Scrapy. It lets you write spiders in which the extraction logic is separated from the crawling logic. With scrapy-poet, a single spider can support many sites with different layouts.
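
To make the idea concrete, here is a minimal, library-free sketch of the Page Object pattern in plain Python. It does not use the scrapy-poet or web-poet APIs: `Page`, `BookPageA`, `BookPageB`, `PAGE_OBJECTS`, `parse` and the `a.example` / `b.example` domains are all hypothetical names chosen for illustration. It only shows how one crawling function can serve two layouts when each layout's extraction logic lives in its own page object.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Page:
    """Hypothetical stand-in for a downloaded response (not a Scrapy class)."""
    url: str
    html: str


class BookPageA:
    """Extraction logic for a layout that keeps the title in an <h1> tag."""

    def __init__(self, page: Page):
        self.page = page

    def to_item(self) -> dict:
        match = re.search(r"<h1>(.*?)</h1>", self.page.html)
        return {"url": self.page.url, "title": match.group(1) if match else None}


class BookPageB:
    """Extraction logic for a layout that keeps the title in a <span class="title">."""

    def __init__(self, page: Page):
        self.page = page

    def to_item(self) -> dict:
        match = re.search(r'<span class="title">(.*?)</span>', self.page.html)
        return {"url": self.page.url, "title": match.group(1) if match else None}


# Crawling side: the only site-specific knowledge is which page object to use.
PAGE_OBJECTS = {
    "a.example": BookPageA,
    "b.example": BookPageB,
}


def parse(page: Page) -> dict:
    """Single 'spider callback' shared by every supported site."""
    page_cls = PAGE_OBJECTS[urlparse(page.url).netloc]
    return page_cls(page).to_item()


if __name__ == "__main__":
    print(parse(Page("https://a.example/book/1", "<h1>Dune</h1>")))
    print(parse(Page("https://b.example/b/2", '<span class="title">Emma</span>')))
```

In this sketch, supporting a new layout means adding one page object class and one entry in the mapping, while `parse` stays unchanged. scrapy-poet provides its own mechanism for connecting page objects to Scrapy callbacks; refer to its documentation for the actual API.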

Read the documentation for more information.

License is BSD 3-clause.

Quick Start

Installation

pip install scrapy-poet

Requires Python 3.7+ and Scrapy >= 2.6.0.

Usage in a Scrapy Project

Add the following inside Scrapy’s settings.py file:

DOWNLOADER_MIDDLEWARES = {
    "scrapy_poet.InjectionMiddleware": 543,
}
SPIDER_MIDDLEWARES = {
    "scrapy_poet.RetryMiddleware": 275,
}

Developing

Set up your local Python environment via:

  1. pip install -r requirements-dev.txt

  2. pre-commit install

Now every time you perform a git commit, these tools will run against the staged files:

  • black

  • isort

  • flake8

You can also directly invoke pre-commit run --all-files or tox -e linters to run them without performing a commit.

Download files

Source Distribution

scrapy-poet-0.12.0.tar.gz (49.0 kB view details)

Uploaded Source

Built Distribution

scrapy_poet-0.12.0-py3-none-any.whl (27.1 kB view details)

Uploaded Python 3

File details

Details for the file scrapy-poet-0.12.0.tar.gz.

File metadata

  • Download URL: scrapy-poet-0.12.0.tar.gz
  • Upload date:
  • Size: 49.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.3

File hashes

Hashes for scrapy-poet-0.12.0.tar.gz
Algorithm Hash digest
SHA256 70fab8e48ffc8a3c6acba7e879b0365fe3a1aac1eef7182c6e32c03e62053ca1
MD5 61fa0711d54f2aa5c5610644d1e207ef
BLAKE2b-256 e9e33b71cddd4abac7a2246bafe6b2ebb1d5b812841d8d9377a9c867a91bf596

File details

Details for the file scrapy_poet-0.12.0-py3-none-any.whl.

File metadata

  • Download URL: scrapy_poet-0.12.0-py3-none-any.whl
  • Upload date:
  • Size: 27.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.3

File hashes

Hashes for scrapy_poet-0.12.0-py3-none-any.whl
Algorithm Hash digest
SHA256 dd3e9180396464e577a65f77b864c683f3c04255616e58a8f4e3eecffae59e81
MD5 20ad83726040914ce57399818693f69f
BLAKE2b-256 f7a295e416d058565d259e911a83d7da3ef852a291ff7dc07d0dc8c98d06ec9f
