Implements QoS (token bucket) rate limiting as a Scrapy downloader middleware
Scrapy-QOS
QOS components for Scrapy
Usage
Activate the QosDownloaderMiddleware in settings.py:
DOWNLOADER_MIDDLEWARES = {
    "scrapy_qos.QosDownloaderMiddleware": 543,
}
Configure the following options in settings.py:
- QOS_IOPS_ENABLED
  - default: False
  - set to True to enable the IOPS limiter
- QOS_IOPS_CAPACITY
  - default: 1
  - burst I/O count per second
- QOS_IOPS_LIMIT
  - default: 1 / s
  - number of requests sent per second
- QOS_BPS_ENABLED
  - default: False
  - set to True to enable the BPS limiter
- QOS_BPS_CAPACITY
  - default: 1048576 bytes
  - burst I/O bytes per second
- QOS_BPS_LIMIT
  - default: 1048576 bytes / s
  - number of response bytes received per second
- QOS_SMALL_RESPONSE_SIZE
  - default: 1048576 bytes
  - used when guessing the next response size: responses smaller than this value are filtered out
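As an illustration, a settings.py fragment that enables both limiters might look like the following. The option names and the middleware path are taken from this README; the specific values (a burst of 5 requests, 2 requests per second, 512 KiB per second) are arbitrary assumptions chosen for the example, not recommendations.

```python
# settings.py (illustrative values only; option names come from this README)
DOWNLOADER_MIDDLEWARES = {
    "scrapy_qos.QosDownloaderMiddleware": 543,
}

# Request-rate (IOPS) limiter
QOS_IOPS_ENABLED = True
QOS_IOPS_CAPACITY = 5   # assumed value: allow a burst of up to 5 requests
QOS_IOPS_LIMIT = 2      # assumed value: 2 requests per second

# Bandwidth (BPS) limiter
QOS_BPS_ENABLED = True
QOS_BPS_CAPACITY = 1048576          # 1 MiB burst (the documented default)
QOS_BPS_LIMIT = 512 * 1024          # assumed value: 512 KiB per second
QOS_SMALL_RESPONSE_SIZE = 1048576   # the documented default
```

Leaving either QOS_IOPS_ENABLED or QOS_BPS_ENABLED at its default of False should keep that limiter switched off.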
Requirements
- Python 3.7+
- Scrapy >= 2.0
- asyncio
Installation
From pip
pip install scrapy-qos
From Gitee
git clone https://gitee.com/hgdsdq/scrapy_qos.git
cd scrapy_qos
python setup.py install
Implementation
- QoS is implemented with the token bucket algorithm.
- For Scrapy, QosDownloaderMiddleware guesses the size of the next response body, which the BPS limiter uses:
α = 0.8
guess_response_size = (1 - α) * guess_response_size + α * response_size
where response_size is the body size of the most recently received response.
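The two ideas above can be sketched in a few lines of Python. This is not the package's actual code: the `TokenBucket` class, `try_consume`, and `update_guess` are hypothetical names introduced for illustration. The sketch also assumes that the second term of the size update uses the most recently observed response size, which is the usual form of an exponential moving average.

```python
import time


class TokenBucket:
    """Hypothetical minimal token bucket: refills at `rate` tokens per
    second and never holds more than `capacity` tokens."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = float(capacity)
        self.rate = float(rate)
        self.clock = clock
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last check, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_consume(self, amount=1.0):
        """Take `amount` tokens if available; return True on success."""
        self._refill()
        if self.tokens >= amount:
            self.tokens -= amount
            return True
        return False


ALPHA = 0.8  # smoothing factor α from this README


def update_guess(guess, observed, alpha=ALPHA):
    """Exponential moving average; `observed` is the latest response size (assumption)."""
    return (1 - alpha) * guess + alpha * observed


# Demo with a fake clock so the outcome does not depend on real time.
fake_now = [0.0]
bucket = TokenBucket(capacity=2, rate=1, clock=lambda: fake_now[0])
burst = [bucket.try_consume() for _ in range(3)]  # two succeed, the third is refused
fake_now[0] += 1.0                                # one second later, one token is back
after_wait = bucket.try_consume()
guess = update_guess(1000.0, 2000.0)              # moves most of the way toward 2000
```

For an IOPS limiter each request would cost one token; for a BPS limiter the cost would be the guessed response size in bytes, which is why the middleware needs an estimate before the response arrives.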