Anti-detection Scrapy middleware — proxy routing and browser rendering for web scraping

scrapy-calyprium

Anti-detection Scrapy middleware for web scraping — proxy routing and stealth browser rendering powered by Calyprium.

Install

pip install scrapy-calyprium

Quick Start

# settings.py
import scrapy_calyprium

scrapy_calyprium.configure(api_key="clp_your_key_here")

This auto-configures:

  • VeilProxyMiddleware — routes requests through rotating proxies with TLS fingerprinting
  • MimicBrowserMiddleware — renders JavaScript pages with stealth browser instances
  • S3 feed storage — write spider output to Calyprium storage using Scrapy's built-in S3FeedStorage

Usage

Automatic Configuration (recommended)

# settings.py
import scrapy_calyprium

scrapy_calyprium.configure(
    api_key="clp_your_key_here",
    mimic_stealth_level="maximum",  # basic, moderate, maximum
)

Manual Configuration

# settings.py
DOWNLOADER_MIDDLEWARES = {
    "scrapy_calyprium.VeilProxyMiddleware": 100,
    "scrapy_calyprium.MimicBrowserMiddleware": 200,
}

CALYPRIUM_API_KEY = "clp_your_key_here"
VEIL_USER_ID = "your-user-id"
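
The numbers in DOWNLOADER_MIDDLEWARES are standard Scrapy priorities: Scrapy calls process_request hooks in increasing order and process_response hooks in decreasing order. The sketch below only illustrates that ordering for the two entries above; it makes no claim about how the two middlewares interact internally.

```python
# Same mapping as in the manual configuration above
DOWNLOADER_MIDDLEWARES = {
    "scrapy_calyprium.VeilProxyMiddleware": 100,
    "scrapy_calyprium.MimicBrowserMiddleware": 200,
}

# Scrapy runs process_request from the lowest priority number to the highest...
request_order = sorted(DOWNLOADER_MIDDLEWARES, key=DOWNLOADER_MIDDLEWARES.get)

# ...and process_response in the reverse direction.
response_order = list(reversed(request_order))

print(request_order)
print(response_order)
```

If a project already defines other downloader middlewares, these two entries would be merged into the existing dictionary rather than replacing it.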

Saving Output to Calyprium Storage

Spider output is saved to Calyprium's S3-compatible storage using Scrapy's built-in feed export:

# settings.py
import scrapy_calyprium

scrapy_calyprium.configure(api_key="clp_your_key_here")

FEEDS = {
    "s3://calyprium/my-spider/%(time)s.jl": {
        "format": "jsonlines",
    },
}

The S3 credentials are auto-configured by configure() — no additional setup needed.
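
Scrapy feed URIs support the printf-style placeholders %(time)s and %(name)s, which Scrapy fills in when the feed is created. A minimal sketch of a per-spider layout follows; it assumes Calyprium storage accepts arbitrary key prefixes under the calyprium bucket, which the text above does not confirm, and the sample values are made up for illustration.

```python
# settings.py (sketch)
FEEDS = {
    # %(name)s -> spider name, %(time)s -> timestamp; both are filled in by Scrapy
    "s3://calyprium/%(name)s/%(time)s.jl": {
        "format": "jsonlines",
    },
}

# Illustration only: the same printf-style substitution with sample values.
# The timestamp string here is a made-up example, not Scrapy's exact output.
sample_params = {"name": "example", "time": "2024-01-01T00-00-00"}
expanded = [uri % sample_params for uri in FEEDS]
print(expanded[0])
```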

Browser Rendering

Mark requests that need JavaScript rendering:

import scrapy

class MySpider(scrapy.Spider):
    name = "example"

    def start_requests(self):
        # Regular request (proxy only)
        yield scrapy.Request("https://example.com")

        # Browser-rendered request
        yield scrapy.Request(
            "https://example.com/spa",
            meta={"mimic": True},
        )
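
When only some paths on a site need JavaScript rendering, the choice of meta can be factored into a small helper. This is a sketch under assumptions: request_meta and JS_PATH_PREFIXES are hypothetical names, not part of scrapy-calyprium; only the {"mimic": True} flag comes from the example above.

```python
from urllib.parse import urlparse

# Hypothetical list of path prefixes known to need browser rendering
JS_PATH_PREFIXES = ("/spa", "/app")


def request_meta(url: str) -> dict:
    """Return request meta, flagging JS-heavy paths for browser rendering."""
    path = urlparse(url).path
    for prefix in JS_PATH_PREFIXES:
        # Match the prefix itself or anything below it, but not "/spam"
        if path == prefix or path.startswith(prefix + "/"):
            return {"mimic": True}
    return {}
```

Inside a spider, this could be used as scrapy.Request(url, meta=request_meta(url)), so that all other requests stay on the proxy-only path.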

Authentication

All middleware requires a valid API key. Set it via:

  1. scrapy_calyprium.configure(api_key="clp_...")
  2. CALYPRIUM_API_KEY environment variable
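
To keep the key out of version control, it can be read from the environment and passed to configure() explicitly. A minimal sketch, assuming a hypothetical load_api_key helper that fails fast when the variable is missing:

```python
import os


def load_api_key(env=os.environ) -> str:
    """Read CALYPRIUM_API_KEY from the environment, failing fast if absent."""
    key = env.get("CALYPRIUM_API_KEY", "").strip()
    if not key:
        raise RuntimeError("CALYPRIUM_API_KEY is not set")
    return key


# In settings.py the result could then be passed on explicitly:
# scrapy_calyprium.configure(api_key=load_api_key())
```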

Settings Reference

Setting               Description                                    Default
CALYPRIUM_API_KEY     API key for all services
VEIL_GATEWAY_URL      Proxy gateway URL                              https://proxy.calyprium.com
VEIL_USER_ID          User ID for proxy routing
VEIL_PROFILE          Proxy routing profile
VEIL_PROXY_TYPE       datacenter, residential, residential_rotating
MIMIC_SERVICE_URL     Mimic browser service URL                      https://mimic.calyprium.com
MIMIC_STEALTH_LEVEL   basic, moderate, maximum                       moderate
MIMIC_BROWSER_ENGINE  Specific browser engine                        auto
MIMIC_USE_PROXY       Route browser through proxy                    False
MIMIC_ALL_REQUESTS    Render all requests via browser                False
MIMIC_USE_SPECTRE     Use device fingerprints                        True
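
As an illustration of the settings above, a settings.py might override a few defaults as sketched below. This assumes the settings are read from settings.py like ordinary Scrapy settings, as in the manual configuration; the invalid_settings check is a hypothetical helper, not part of scrapy-calyprium.

```python
# settings.py (sketch): override a few defaults from the settings reference
CALYPRIUM_API_KEY = "clp_your_key_here"
VEIL_PROXY_TYPE = "residential"
MIMIC_STEALTH_LEVEL = "maximum"
MIMIC_USE_PROXY = True

# Choices listed in the settings reference
ALLOWED_VALUES = {
    "VEIL_PROXY_TYPE": {"datacenter", "residential", "residential_rotating"},
    "MIMIC_STEALTH_LEVEL": {"basic", "moderate", "maximum"},
}


def invalid_settings(settings: dict) -> list:
    """Return names of settings whose value is outside the listed choices."""
    return [
        name
        for name, allowed in ALLOWED_VALUES.items()
        if name in settings and settings[name] not in allowed
    ]
```

Calling invalid_settings(globals()) at the end of settings.py would catch a mistyped stealth level or proxy type before the crawl starts.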

License

MIT
