
Scrapy middleware for Zenscrape


Scrapy Zenscrape Middleware

Acknowledgements

Thanks to arimbr and ScrapingBee; this is an adaptation of their work.

Installation

pip install scrapy-zenscrape

Configuration

Add your ZENSCRAPE_API_KEY and the ZenscrapeMiddleware to your project's settings.py. Don't forget to set CONCURRENT_REQUESTS to match the concurrency limit of your Zenscrape plan.

ZENSCRAPE_API_KEY = 'REPLACE-WITH-YOUR-API-KEY'

DOWNLOADER_MIDDLEWARES = {
    'scrapy_zenscrape.ZenscrapeMiddleware': 700,
}

CONCURRENT_REQUESTS = 1
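
If you would rather not commit the key to version control, settings.py can read it from the environment. This is a minimal sketch, not part of the package: the load_api_key helper is hypothetical, and reusing ZENSCRAPE_API_KEY as the environment variable name is an assumption.

```python
import os


def load_api_key(env=os.environ, default="REPLACE-WITH-YOUR-API-KEY"):
    """Return the API key from the given mapping, or the placeholder if unset."""
    # The environment variable name is an assumption; pick whatever suits your deployment.
    return env.get("ZENSCRAPE_API_KEY", default)


# settings.py only needs the resulting module-level name.
ZENSCRAPE_API_KEY = load_api_key()
```

Export the variable in your shell before running the crawl; if it is unset, the placeholder value is used and the API will presumably reject the requests.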

Usage

Inherit your spiders from ZenscrapeSpider and yield a ZenscrapeRequest.

Below you can see an example from the spider in httpbin.py.

from scrapy_zenscrape import ZenscrapeSpider, ZenscrapeRequest

# Inherit from ZenscrapeSpider rather than the plain Scrapy Spider.
class HttpbinSpider(ZenscrapeSpider):
    name = 'httpbin'
    start_urls = [
        'https://httpbin.org',
    ]

    def start_requests(self):
        for url in self.start_urls:
            yield ZenscrapeRequest(url, params={
                # 'render': False,
                # 'block_ads': True,
                # 'block_resources': False,
                # 'premium': True,
                # 'location': 'fr',
                # 'wait_for': 5,
                # 'wait_for_css': '#swagger-ui',
            },
            headers={
                # 'Accept-Language': 'En-US',
            },
            cookies={
                # 'name_1': 'value_1',
            })

    def parse(self, response):
        ...

You can pass Zenscrape parameters in the params argument of a ZenscrapeRequest. Headers and cookies are passed as in a normal Scrapy Request. ZenscrapeRequest converts all parameters, headers and cookies into the format expected by the API.
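
To give a rough idea of what this conversion can look like, here is a minimal sketch of turning a target URL and a params dict into a single API URL. It is an illustration only, not the package's actual code: the build_api_url helper is hypothetical, and both the endpoint and the lowercase encoding of booleans are assumptions to check against the Zenscrape documentation.

```python
from urllib.parse import urlencode

# Assumed endpoint, for illustration only; check the Zenscrape documentation.
API_URL = "https://app.zenscrape.com/api/v1/get"


def build_api_url(target_url, params=None):
    """Return an API URL that asks the service to fetch target_url."""
    query = {"url": target_url}
    for key, value in (params or {}).items():
        # Encode booleans as lowercase strings, as query-string APIs commonly expect.
        query[key] = str(value).lower() if isinstance(value, bool) else value
    # urlencode percent-encodes the target URL so it survives as a single query value.
    return f"{API_URL}?{urlencode(query)}"


example = build_api_url("https://httpbin.org", {"render": True, "location": "fr"})
```

The point of the sketch is that the spider keeps working with the original target URL, while the request that actually goes over the wire is addressed to the API with the target URL carried as a parameter.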

Examples

Add your API key to settings.py.

To run the examples you need to clone this repository. In your terminal, go to examples/httpbin/httpbin and run the example spider with:

scrapy crawl httpbin
