A Scrapy downloader middleware that bypasses Cloudflare detection
Project description
AroayCloudScraper
A Scrapy plugin that bypasses Cloudflare detection. It is mainly a wrapper around the cloudscraper module, and it runs cloudscraper asynchronously inside Scrapy.
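The general idea of running a blocking HTTP client without stalling an asynchronous crawler can be sketched as below. This is only an illustration under stated assumptions: it uses the standard library's asyncio rather than Scrapy's own machinery, and `blocking_fetch` is a hypothetical stand-in for a blocking cloudscraper call. It is not the package's actual implementation.

```python
import asyncio
import time


def blocking_fetch(url: str) -> str:
    # Hypothetical stand-in for a blocking HTTP call (for example, one made
    # through cloudscraper). It only sleeps and returns a fake body.
    time.sleep(0.1)
    return f"body of {url}"


async def fetch_all(urls):
    loop = asyncio.get_running_loop()
    # run_in_executor hands each blocking call to the default thread pool,
    # so the event loop stays free while the calls are waiting.
    tasks = [loop.run_in_executor(None, blocking_fetch, url) for url in urls]
    # gather returns results in the same order as the input tasks.
    return await asyncio.gather(*tasks)


def run_demo():
    urls = ["https://example.com/a", "https://example.com/b"]
    return asyncio.run(fetch_all(urls))
```

Both calls wait in worker threads at the same time, so the total wall time is close to that of a single call rather than the sum of both.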
Usage: add the following to your settings
DOWNLOADER_MIDDLEWARES = {
    'aroay_cloudscraper.downloadermiddlewares.CloudScraperMiddleware': 543,
}
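Required settings
# Default log level (needs `import logging` at the top of the settings file)
AROAY_CLOUDSCRAPER_LOGGING_LEVEL = logging.DEBUG
# Default timeout
AROAY_CLOUDSCRAPER_DOWNLOAD_TIMEOUT = 30
# Default delay
AROAY_CLOUDSCRAPER_DELAY = 1
# Must be set, otherwise an error is raised
COMPRESSION_ENABLED = False
RETRY_ENABLED = True
RETRY_TIMES = 3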
Using a proxy
def start_requests(self):
    for page in range(1, 2):
        yield CloudScraperRequest(
            self.base_url,
            callback=self.parse_index,
            dont_filter=True,
            # Proxy URLs keyed by scheme
            proxy={
                "http": "http://username:password@ip:port",
                "https": "http://username:password@ip:port",
            },
            cookies={"over18": "1"},
            timeout=5,
        )
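Proxy credentials that contain characters such as `@` or `:` will break a hand-written proxy URL. A minimal sketch of building the proxy mapping safely is shown here; `build_proxies` is a hypothetical helper introduced for illustration and is not part of aroay_cloudscraper.

```python
from urllib.parse import quote


def build_proxies(username: str, password: str, host: str, port: int) -> dict:
    """Return a proxy mapping keyed by scheme, as in the proxy example."""
    # Percent-encode the credentials so characters such as "@" or ":"
    # cannot be mistaken for URL delimiters.
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    proxy_url = f"http://{user}:{pwd}@{host}:{port}"
    # The same HTTP proxy endpoint is used for both schemes.
    return {"http": proxy_url, "https": proxy_url}
```

The returned dictionary can then be passed as the `proxy` argument of `CloudScraperRequest`, for example `proxy=build_proxies("user", "p@ss:1", "127.0.0.1", 8080)`.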
Download files
Source Distribution
aroay_cloudscraper-1.4.tar.gz
(4.2 kB)
Built Distribution
Hashes for aroay_cloudscraper-1.4-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 408491cea338cbd934ce915fec69d48dff36c31868284530f6ab1af6ae7bbe1f
MD5 | 2c0b7771e9a41d227df1666c44578676
BLAKE2b-256 | 9e292973c723b57867bfe2faf210b1bd1390858b10f1f9328398e418b9682c75