Anti-detection Scrapy middleware — proxy routing and browser rendering for web scraping
scrapy-calyprium
Anti-detection Scrapy middleware for web scraping — proxy routing and stealth browser rendering powered by Calyprium.
Install
pip install scrapy-calyprium
Quick Start
# settings.py
import scrapy_calyprium
scrapy_calyprium.configure(api_key="clp_your_key_here")
This auto-configures:
- VeilProxyMiddleware: routes requests through rotating proxies with TLS fingerprinting
- MimicBrowserMiddleware: renders JavaScript pages with stealth browser instances
- S3 feed storage: writes spider output to Calyprium storage using Scrapy's built-in S3FeedStorage
Usage
Automatic Configuration (recommended)
# settings.py
import scrapy_calyprium
scrapy_calyprium.configure(
    api_key="clp_your_key_here",
    mimic_stealth_level="maximum",  # basic, moderate, maximum
)
Manual Configuration
# settings.py
DOWNLOADER_MIDDLEWARES = {
    "scrapy_calyprium.VeilProxyMiddleware": 100,
    "scrapy_calyprium.MimicBrowserMiddleware": 200,
}
CALYPRIUM_API_KEY = "clp_your_key_here"
VEIL_USER_ID = "your-user-id"
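Scrapy calls each downloader middleware's `process_request` in ascending priority order, so with the values above the proxy middleware (100) sees a request before the browser middleware (200). The sketch below only sorts the same dictionary to make that order visible. It does not import Scrapy or scrapy-calyprium, and `request_order` is a hypothetical helper name, not part of either package.

```python
# Same mapping as in the manual configuration above.
DOWNLOADER_MIDDLEWARES = {
    "scrapy_calyprium.VeilProxyMiddleware": 100,
    "scrapy_calyprium.MimicBrowserMiddleware": 200,
}


def request_order(middlewares):
    """Return middleware paths in the order Scrapy calls process_request.

    Scrapy sorts by ascending priority; entries set to None are disabled.
    """
    enabled = {path: prio for path, prio in middlewares.items() if prio is not None}
    return sorted(enabled, key=enabled.get)


if __name__ == "__main__":
    for path in request_order(DOWNLOADER_MIDDLEWARES):
        print(path)
```

Responses travel back through the same middlewares in the reverse order, which is worth keeping in mind if you add your own middleware between these two.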
Saving Output to Calyprium Storage
Spider output is saved to Calyprium's S3-compatible storage using Scrapy's built-in feed export:
# settings.py
import scrapy_calyprium
scrapy_calyprium.configure(api_key="clp_your_key_here")
FEEDS = {
    "s3://calyprium/my-spider/%(time)s.jl": {
        "format": "jsonlines",
    },
}
The S3 credentials are auto-configured by configure() — no additional setup needed.
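The `%(time)s` placeholder in the feed URI is filled in by Scrapy with printf-style formatting when the feed is created. The following is a rough approximation of that expansion using only the standard library, so you can preview the object key a run would produce. `expand_feed_uri` is a hypothetical helper, and the exact timestamp format may differ between Scrapy versions.

```python
from datetime import datetime, timezone

FEED_URI = "s3://calyprium/my-spider/%(time)s.jl"


def expand_feed_uri(template, spider_name, now=None):
    """Approximate Scrapy's printf-style expansion of a feed URI."""
    now = now or datetime.now(timezone.utc)
    params = {
        # Colons are replaced with dashes so the timestamp is safe in object keys.
        "time": now.replace(microsecond=0, tzinfo=None).isoformat().replace(":", "-"),
        # Scrapy also exposes the spider name as %(name)s.
        "name": spider_name,
    }
    return template % params


if __name__ == "__main__":
    fixed = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
    print(expand_feed_uri(FEED_URI, "example", now=fixed))
```

Using `%(name)s` in place of the hard-coded `my-spider` segment lets several spiders share one `FEEDS` entry.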
Browser Rendering
Mark requests that need JavaScript rendering:
import scrapy


class MySpider(scrapy.Spider):
    name = "example"

    def start_requests(self):
        # Regular request (proxy only)
        yield scrapy.Request("https://example.com")
        # Browser-rendered request
        yield scrapy.Request(
            "https://example.com/spa",
            meta={"mimic": True},
        )
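When only some sections of a site need JavaScript, it can help to decide the `mimic` flag in one place instead of at every call site. Here is a minimal sketch of that idea. `JS_PATH_PREFIXES` and `request_kwargs` are hypothetical names chosen for illustration, and the prefixes are made-up examples. Only the `meta={"mimic": True}` key comes from the usage above.

```python
from urllib.parse import urlparse

# Hypothetical path prefixes that are known to need JavaScript rendering.
JS_PATH_PREFIXES = ("/spa", "/app")


def request_kwargs(url, js_prefixes=JS_PATH_PREFIXES):
    """Build keyword arguments for scrapy.Request.

    Adds meta={"mimic": True} only when the URL path starts with one of
    the given prefixes; other URLs get a plain (proxy only) request.
    """
    kwargs = {"url": url}
    if urlparse(url).path.startswith(js_prefixes):
        kwargs["meta"] = {"mimic": True}
    return kwargs


if __name__ == "__main__":
    print(request_kwargs("https://example.com"))
    print(request_kwargs("https://example.com/spa"))
```

Inside a spider this would be used as `yield scrapy.Request(**request_kwargs(url))`.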
Authentication
All middleware requires a valid API key. Set it via:
- scrapy_calyprium.configure(api_key="clp_...")
- CALYPRIUM_API_KEY environment variable
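To keep the key out of version control, one option is to read it from the environment in settings.py and pass it on explicitly. The sketch below shows that lookup with the standard library only. `resolve_api_key` is a hypothetical helper, and the rule that an explicit key takes precedence over the environment variable is an assumption of this example, not documented package behavior.

```python
import os


def resolve_api_key(explicit_key=None, environ=os.environ):
    """Return the API key to use, or raise if none is available.

    Assumption: an explicitly passed key wins over the environment variable.
    """
    key = explicit_key or environ.get("CALYPRIUM_API_KEY")
    if not key:
        raise RuntimeError("Set CALYPRIUM_API_KEY or pass api_key to configure()")
    return key


if __name__ == "__main__":
    # Example with a stand-in environment mapping instead of the real one.
    print(resolve_api_key(environ={"CALYPRIUM_API_KEY": "clp_your_key_here"}))
```

In settings.py the result could then be handed to `scrapy_calyprium.configure(api_key=resolve_api_key())`, which fails early with a readable message when the key is missing.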
Settings Reference
| Setting | Description | Default |
|---|---|---|
| CALYPRIUM_API_KEY | API key for all services | - |
| VEIL_GATEWAY_URL | Proxy gateway URL | https://proxy.calyprium.com |
| VEIL_USER_ID | User ID for proxy routing | - |
| VEIL_PROFILE | Proxy routing profile | - |
| VEIL_PROXY_TYPE | datacenter, residential, residential_rotating | - |
| MIMIC_SERVICE_URL | Mimic browser service URL | https://mimic.calyprium.com |
| MIMIC_STEALTH_LEVEL | basic, moderate, maximum | moderate |
| MIMIC_BROWSER_ENGINE | Specific browser engine | auto |
| MIMIC_USE_PROXY | Route browser through proxy | False |
| MIMIC_ALL_REQUESTS | Render all requests via browser | False |
| MIMIC_USE_SPECTRE | Use device fingerprints | True |
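Two of the settings above accept a fixed set of values, so a typo such as `"max"` is easy to catch before a crawl starts. The following is a small, standalone check under that assumption. `ALLOWED_VALUES` and `invalid_settings` are hypothetical names, and the package may or may not perform similar validation itself.

```python
# Allowed values copied from the settings reference above.
ALLOWED_VALUES = {
    "VEIL_PROXY_TYPE": {"datacenter", "residential", "residential_rotating"},
    "MIMIC_STEALTH_LEVEL": {"basic", "moderate", "maximum"},
}


def invalid_settings(settings):
    """Return {name: value} for enumerated settings with unsupported values.

    Settings that are absent are skipped, since they fall back to defaults.
    """
    return {
        name: settings[name]
        for name, allowed in ALLOWED_VALUES.items()
        if name in settings and settings[name] not in allowed
    }


if __name__ == "__main__":
    print(invalid_settings({"MIMIC_STEALTH_LEVEL": "max"}))
```

An empty result means every enumerated setting that is present uses a listed value.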
License
MIT