Implements QoS (token bucket) as a Scrapy downloader middleware
Scrapy-QOS
QOS components for Scrapy
Usage
Activate the QosDownloaderMiddleware in settings.py:
DOWNLOADER_MIDDLEWARES = {
    "scrapy_qos.QosDownloaderMiddleware": 543,
}
Configure the following options in settings.py:
- QOS_IOPS_ENABLED
  - default: False
  - set to True to enable the IOPS limiter
- QOS_IOPS_CAPACITY
  - default: 1
  - burst size: the maximum number of requests that can be sent in one burst
- QOS_IOPS_LIMIT
  - default: 1 / s
  - how many requests are sent per second
- QOS_BPS_ENABLED
  - default: False
  - set to True to enable the BPS limiter
- QOS_BPS_CAPACITY
  - default: 1048576 bytes
  - burst size: the maximum number of response bytes that can be received in one burst
- QOS_BPS_LIMIT
  - default: 1048576 bytes / s
  - how many response bytes are received per second
- QOS_SMALL_RESPONSE_SIZE
  - default: 1048576 bytes
  - used when guessing the next response size: responses smaller than this value are filtered out
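As an illustration of how the options above fit together, here is a minimal settings.py sketch that enables both limiters. The values are arbitrary examples rather than recommendations, and it assumes the options accept plain integers, as the documented defaults suggest.

```python
# settings.py (illustrative values only)

DOWNLOADER_MIDDLEWARES = {
    "scrapy_qos.QosDownloaderMiddleware": 543,
}

# Request-rate limiter: bursts of up to 5 requests, 2 requests per second sustained.
QOS_IOPS_ENABLED = True
QOS_IOPS_CAPACITY = 5
QOS_IOPS_LIMIT = 2

# Bandwidth limiter: bursts of up to 4 MiB, 1 MiB per second sustained.
QOS_BPS_ENABLED = True
QOS_BPS_CAPACITY = 4 * 1024 * 1024
QOS_BPS_LIMIT = 1024 * 1024

# Threshold used when guessing the next response size (kept at the default).
QOS_SMALL_RESPONSE_SIZE = 1024 * 1024
```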
Requirements
- Python 3.7+
- Scrapy >= 2.0
- asyncio
Installation
From pip
pip install scrapy-qos
From Gitee
git clone https://gitee.com/hgdsdq/scrapy_qos.git
cd scrapy_qos
python setup.py install
Implementation
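- Basic QoS implementation using the token bucket algorithm
- For Scrapy, QosDownloaderMiddleware guesses the next response body size, which is used by the BPS limiter. The guess is an exponential moving average, where response_size is the body size of the latest response:
α = 0.8
guess_response_size = (1 - α) * guess_response_size + α * response_size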
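The two ideas in this section can be sketched in a few lines of Python. This is a generic illustration, not the package's actual code: the names TokenBucket, try_consume and update_guess are hypothetical, and the size update assumes the guess is blended with the body size of the latest response.

```python
import time


class TokenBucket:
    """Generic token bucket: `capacity` bounds bursts, `rate` is tokens refilled per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock          # injectable clock, handy for testing
        self.tokens = capacity      # start full so an initial burst is allowed
        self.updated = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last update, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_consume(self, amount=1):
        """Take `amount` tokens if available and return True, otherwise return False."""
        self._refill()
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False


ALPHA = 0.8  # weight given to the latest observation


def update_guess(guess, response_size, alpha=ALPHA):
    """Exponential moving average of response body sizes (assumed form of the update)."""
    return (1 - alpha) * guess + alpha * response_size
```

In this sketch an IOPS limiter would call try_consume(1) per request, while a BPS limiter would call try_consume(guess) with the current size guess and refine the guess with update_guess once the real response arrives.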