Scrapy Middleware to set a random User-Agent for every Request.
Project description
Does your Scrapy spider get identified and blocked by servers because it uses the default user-agent or a generic one?
Use this random_useragent module to set a random user-agent for every request.
Installing
Installing it is pretty simple.
pip install git+https://github.com/cleocn/scrapy-random-useragent.git
Usage
In your settings.py file, update the DOWNLOADER_MIDDLEWARES setting like this:
DOWNLOADER_MIDDLEWARES = {
    'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
    'random_useragent.RandomUserAgentMiddleware': 400,
}
This disables the default UserAgentMiddleware and enables the RandomUserAgentMiddleware. (Note: on Scrapy 1.0 and later, the built-in middleware lives at 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware' rather than the scrapy.contrib path shown above.)
Now all the requests from your crawler will have a random user-agent.
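To illustrate the mechanism, here is a minimal sketch of a random user-agent downloader middleware. This is not the package's actual implementation (check its source for how it loads its user-agent list); the user-agent strings and class body below are illustrative assumptions. Scrapy calls process_request for every request passing through the downloader, and returning None lets the request continue through the chain.

```python
import random

# Illustrative user-agent pool; in practice you would supply your own
# (larger) list, e.g. loaded from a file or a Scrapy setting.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/119.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36",
]


class RandomUserAgentMiddleware:
    """Sketch of a downloader middleware that assigns a random
    User-Agent header to every outgoing request."""

    def __init__(self, user_agents=USER_AGENTS):
        self.user_agents = user_agents

    def process_request(self, request, spider):
        # Overwrite the User-Agent header with a randomly chosen one.
        # Returning None tells Scrapy to continue processing the request.
        request.headers["User-Agent"] = random.choice(self.user_agents)
        return None
```

Because the header is set per request (not per spider), every request in a single crawl can carry a different user-agent.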
Source Distribution: scrapy-random-ua-0.3.tar.gz (3.2 kB)
File hashes

Algorithm | Hash digest
---|---
SHA256 | 0b21843e0eb86eb7351c0762c1851343b865d4a0b750146250f0c86581e2b972
MD5 | 2ac2a82c06b11252e0ce1b724416aaab
BLAKE2b-256 | 0b6fd48d5c0f651e2ede25637ca91584550f793dc967f68f56e56541a154e216