Page Object pattern for Scrapy

Project description

scrapy-poet is the web-poet Page Object pattern implementation for Scrapy. It lets you write spiders in which the extraction logic is separated from the crawling logic. With scrapy-poet, it is possible to write a single spider that supports many sites with different layouts.
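The idea of separating extraction from crawling can be illustrated with a small, library-free sketch. It deliberately does not use the scrapy-poet or web-poet APIs; the class names, the `PAGE_OBJECTS` registry, the `crawl` function, and the example domains are all hypothetical and exist only to show the shape of the pattern: one crawling routine, with per-site extraction kept in page objects.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Page:
    """Base page object: holds one downloaded page and extracts data from it."""

    url: str
    html: str

    def to_item(self) -> dict:
        raise NotImplementedError


class SiteAProductPage(Page):
    # Hypothetical layout: the product name lives in an <h1> tag.
    def to_item(self) -> dict:
        match = re.search(r"<h1>(.*?)</h1>", self.html)
        return {"url": self.url, "name": match.group(1) if match else None}


class SiteBProductPage(Page):
    # Hypothetical layout: the product name lives in <span class="title">.
    def to_item(self) -> dict:
        match = re.search(r'<span class="title">(.*?)</span>', self.html)
        return {"url": self.url, "name": match.group(1) if match else None}


# Hypothetical registry: which page object handles which domain.
PAGE_OBJECTS = {
    "site-a.example": SiteAProductPage,
    "site-b.example": SiteBProductPage,
}


def crawl(responses):
    """Crawling logic: knows nothing about layouts, only picks a page object."""
    for url, html in responses:
        page_cls = PAGE_OBJECTS[urlparse(url).netloc]
        yield page_cls(url=url, html=html).to_item()


# Stand-in for downloaded responses: (url, html) pairs.
responses = [
    ("https://site-a.example/p/1", "<html><h1>Red Mug</h1></html>"),
    ("https://site-b.example/item/7", '<html><span class="title">Blue Mug</span></html>'),
]
items = list(crawl(responses))
```

Supporting a new site layout in this sketch means adding one page object class and one registry entry; `crawl` stays unchanged. scrapy-poet applies the same separation inside Scrapy, with its own mechanism for handing page objects to spider callbacks, which is described in the documentation.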

Read the documentation for more information.

License is BSD 3-clause.

Quick Start

Installation

pip install scrapy-poet

Requires Python 3.7+ and Scrapy >= 2.6.0.

Usage in a Scrapy Project

Add the following inside Scrapy’s settings.py file:

DOWNLOADER_MIDDLEWARES = {
    "scrapy_poet.InjectionMiddleware": 543,
}
SPIDER_MIDDLEWARES = {
    "scrapy_poet.RetryMiddleware": 275,
}

Developing

Set up your local Python environment via:

  1. pip install -r requirements-dev.txt

  2. pre-commit install

Now every time you perform a git commit, these tools will run against the staged files:

  • black

  • isort

  • flake8

You can also directly invoke pre-commit run --all-files or tox -e linters to run them without performing a commit.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

scrapy-poet-0.10.0.tar.gz (47.8 kB)

Uploaded Source

Built Distribution

scrapy_poet-0.10.0-py3-none-any.whl (26.3 kB)

Uploaded Python 3

File details

Details for the file scrapy-poet-0.10.0.tar.gz.

File metadata

  • Download URL: scrapy-poet-0.10.0.tar.gz
  • Upload date:
  • Size: 47.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.2

File hashes

Hashes for scrapy-poet-0.10.0.tar.gz
Algorithm Hash digest
SHA256 a4f41f036b6d5547381e3a27a9ec8d6a16c6f5e9dfc7ce13fe42f552cbcc6f8b
MD5 4ca80d0ee5ad2dce50736d1feb8668a2
BLAKE2b-256 46739f6329e1e41c6a1206122979dd9534605e42281548652d1c7d429a5ab277

File details

Details for the file scrapy_poet-0.10.0-py3-none-any.whl.

File metadata

  • Download URL: scrapy_poet-0.10.0-py3-none-any.whl
  • Upload date:
  • Size: 26.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.2

File hashes

Hashes for scrapy_poet-0.10.0-py3-none-any.whl
Algorithm Hash digest
SHA256 d3c1f011e65ae0f7673434088f981525abc2d2128bae4ca48e2256a652adb47f
MD5 5323d062d8e697a00f64771da95d57f7
BLAKE2b-256 2d7e78afac2dcd1f77bcea63b2c4b1adebc9d9d95d5226fe48d333080e224baa
