Scrapy downloader middleware to interact with agentfive API

Project description

This package provides a Scrapy Downloader Middleware to interact with the agentfive API.

Requirements

  • Python 3.5+
  • Scrapy 1.6+

Installation

pip install scrapy-agentfive-middleware

Configuration

Enable the AgentfiveMiddleware via the DOWNLOADER_MIDDLEWARES setting:

DOWNLOADER_MIDDLEWARES = {
    "agentfive_middleware.AgentfiveMiddleware": 585,
}

Note that the middleware must be given a lower priority number than the built-in HttpCompressionMiddleware (which has a priority of 590). Otherwise, incoming responses will still be compressed when they reach the agentfive middleware, and it won't be able to handle them.

Settings

  • AGENTFIVE_KEY (type str)

    API key to be used to authenticate against the agentfive API.

  • AGENTFIVE_API_URL (type str, default "https://api.agentfive.cn/v1")

    The endpoint of the agentfive API.

  • AGENTFIVE_DEFAULT_ARGS (type dict, default {})

    Default values to be sent to the agentfive API. For instance, set it to {"profile": "mobile"} to send every request with a mobile profile.
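
Taken together, the configuration and settings above might look like the following sketch of a project's settings.py. Reading the key from an environment variable is an assumption made for illustration, not something the package requires; the endpoint shown is simply the documented default written out.

```python
# settings.py (illustrative sketch, not a complete Scrapy settings file)
import os

DOWNLOADER_MIDDLEWARES = {
    # 585 is below HttpCompressionMiddleware's 590, as required above.
    "agentfive_middleware.AgentfiveMiddleware": 585,
}

# Assumption: the API key is kept in an environment variable
# instead of being committed to source control.
AGENTFIVE_KEY = os.environ.get("AGENTFIVE_KEY", "")

# The documented default endpoint, stated explicitly.
AGENTFIVE_API_URL = "https://api.agentfive.cn/v1"

# Default values sent to the agentfive API.
AGENTFIVE_DEFAULT_ARGS = {"profile": "mobile"}
```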

Usage

When the middleware is enabled, all requests are by default redirected to the configured agentfive API endpoint, with the parameters the agentfive API expects added to them.

For example:

scrapy.Request(url="https://httpbin.org/anything")

will be sent to the agentfive API, which then fetches the URL.

Additional arguments

Additional arguments can be specified under the agentfive Request.meta key. For instance:

Request(
    url="https://example.org",
    meta={"agentfive": {"render": True, "wait_ms": 5000}},
)

For more information on agentfive API parameters, please refer to the agentfive documentation.
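
This page does not say how per-request arguments interact with AGENTFIVE_DEFAULT_ARGS. The sketch below shows one plausible rule, in which per-request values override the defaults. Here merge_args is a hypothetical helper written for illustration, not part of the package, and dropping the "skip" entry before sending is likewise an assumption.

```python
def merge_args(default_args, meta):
    """Combine default arguments with those under the "agentfive" meta key.

    Assumption: per-request values take precedence over the defaults.
    """
    request_args = dict(meta.get("agentfive", {}))
    # Assumption: "skip" only controls the middleware and is not an API argument.
    request_args.pop("skip", None)
    merged = dict(default_args)
    merged.update(request_args)
    return merged


defaults = {"profile": "mobile"}
meta = {"agentfive": {"render": True, "wait_ms": 5000}}
print(merge_args(defaults, meta))
```

Under this rule, a request that sets its own "profile" would replace the default one, while requests with no agentfive key would carry the defaults unchanged.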

Skipping requests

You can instruct the middleware to skip a specific request by setting the agentfive.skip Request.meta key:

Request(
    url="https://example.org",
    meta={"agentfive": {"skip": True}},
)
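
When only some URLs should bypass the API, the meta dictionary can be built conditionally. In the sketch below, agentfive_meta and the DIRECT_HOSTS set are hypothetical names chosen for illustration; only the {"agentfive": {"skip": True}} shape comes from the example above.

```python
from urllib.parse import urlparse

# Hypothetical set of hosts that should be fetched directly.
DIRECT_HOSTS = {"localhost", "intranet.example.org"}


def agentfive_meta(url, **args):
    """Build a Request.meta dict, skipping the middleware for direct hosts."""
    host = urlparse(url).hostname
    if host in DIRECT_HOSTS:
        return {"agentfive": {"skip": True}}
    # Any other keyword arguments are passed along under the agentfive key.
    return {"agentfive": dict(args)}
```

A spider could then write Request(url=url, meta=agentfive_meta(url, render=True)) and leave the skip decision to the helper.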
