scrapy-user-agents · PyPI

Automatically pick a User-Agent for every request

Project description

A random User-Agent middleware for Scrapy that picks User-Agent strings based on Python User Agents and MDN.

Installation

The simplest way is to install it via pip:

pip install scrapy-user-agents

Configuration

Turn off the built-in UserAgentMiddleware and add RandomUserAgentMiddleware.

In Scrapy >=1.0:

DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    'scrapy_user_agents.middlewares.RandomUserAgentMiddleware': 400,
}

In Scrapy <1.0:

DOWNLOADER_MIDDLEWARES = {
    'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
    'scrapy_user_agents.middlewares.RandomUserAgentMiddleware': 400,
}

User-Agent File

A default User-Agent file is included in this repository. It contains about 2200 user agent strings collected from <https://developers.whatismybrowser.com/> using <https://github.com/hyan15/crawler-demo/tree/master/crawling-basic/common_user_agents>. You can supply your own User-Agent file by setting RANDOM_UA_FILE.
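As a minimal sketch, a custom file could be wired in from settings.py like this. The path and file name are hypothetical, and the source does not state the expected file format, so check the bundled default file before writing your own.

```python
# settings.py (sketch)

# Hypothetical path to a custom User-Agent file; replace with your own.
# The expected file format is not documented here, so mirror the layout
# of the default file shipped with the package.
RANDOM_UA_FILE = "/path/to/my_user_agents.txt"
```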

Configuring User-Agent type

There is a configuration parameter, RANDOM_UA_TYPE, in the format <device_type>.<browser_type>; the default is desktop.chrome. For the device_type part, only desktop, mobile, and tablet are supported. For the browser_type part, only chrome, firefox, safari, and ie are supported. If you don't want to pin a single browser type, you can use random to choose from all browser types.

You can set RANDOM_UA_SAME_OS_FAMILY to True to use only user agents that belong to the same OS family, such as Windows, Mac OS, Linux, Android, or iOS. The default value is True.
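The format described above can be illustrated with a small validation sketch. The helper `parse_ua_type` is hypothetical and not part of scrapy-user-agents; it only checks a value against the options listed in this section. It also assumes that random is written in place of the browser_type part, which the text implies but does not spell out.

```python
# Hypothetical helper, NOT part of scrapy-user-agents: validates a
# RANDOM_UA_TYPE value against the options listed in this section.
SUPPORTED_DEVICE_TYPES = ("desktop", "mobile", "tablet")
# Assumption: "random" is written in place of a concrete browser type.
SUPPORTED_BROWSER_TYPES = ("chrome", "firefox", "safari", "ie", "random")


def parse_ua_type(value="desktop.chrome"):
    """Split '<device_type>.<browser_type>' and check both parts."""
    device_type, sep, browser_type = value.partition(".")
    if not sep:
        raise ValueError("expected '<device_type>.<browser_type>', got %r" % value)
    if device_type not in SUPPORTED_DEVICE_TYPES:
        raise ValueError("unsupported device type: %r" % device_type)
    if browser_type not in SUPPORTED_BROWSER_TYPES:
        raise ValueError("unsupported browser type: %r" % browser_type)
    return device_type, browser_type


# Values you might then place in settings.py:
RANDOM_UA_TYPE = "mobile.firefox"
RANDOM_UA_SAME_OS_FAMILY = True  # the documented default

device_type, browser_type = parse_ua_type(RANDOM_UA_TYPE)
```

Validating the string before a crawl starts is just a convenience for catching typos early; the middleware itself reads the two settings directly.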

Usage with scrapy-proxies

To use it together with a random-proxy middleware such as scrapy-proxies, you need to:

Set RANDOM_UA_PER_PROXY to True to allow switching the User-Agent per proxy.
Set the priority of RandomUserAgentMiddleware to be greater than that of scrapy-proxies, so that the proxy is set before the User-Agent is handled.
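The two steps above might look like the following in settings.py. This is a sketch under assumptions: the middleware path 'scrapy_proxies.RandomProxy' and its priority of 100 are taken from the scrapy-proxies documentation as commonly used, not from this project, so verify them against the version you have installed. Any remaining scrapy-proxies settings are omitted.

```python
# settings.py (sketch)
DOWNLOADER_MIDDLEWARES = {
    # Disable Scrapy's built-in User-Agent middleware.
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
    # Assumed scrapy-proxies entry; path and priority are NOT defined by
    # scrapy-user-agents, so check the scrapy-proxies documentation.
    'scrapy_proxies.RandomProxy': 100,
    # Higher number than the proxy middleware, so the proxy is already
    # set on the request when the User-Agent is chosen.
    'scrapy_user_agents.middlewares.RandomUserAgentMiddleware': 400,
}

# Allow the User-Agent to be switched per proxy.
RANDOM_UA_PER_PROXY = True
```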

Configuring Fake-UserAgent fallback

There is a configuration parameter, FAKEUSERAGENT_FALLBACK, which defaults to None. You can set it to a string value, for example Mozilla or Your favorite browser. Setting it can completely disable any annoying exception.
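For illustration, the fallback could be set in settings.py as below. The value is one of the example strings mentioned above; which string suits your crawl is a judgment call.

```python
# settings.py (sketch)

# Fallback User-Agent string; defaults to None when unset.
FAKEUSERAGENT_FALLBACK = 'Mozilla'
```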

Release history

0.1.1 (this version): Oct 23, 2018

0.1.0: Oct 22, 2018

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

scrapy_user_agents-0.1.1.win-amd64.zip (30.1 kB)

Uploaded Oct 23, 2018 Source

Built Distribution

scrapy_user_agents-0.1.1-py2.py3-none-any.whl (27.9 kB)

Uploaded Oct 23, 2018 Python 2, Python 3

File details

Details for the file scrapy_user_agents-0.1.1.win-amd64.zip.

File metadata

Download URL: scrapy_user_agents-0.1.1.win-amd64.zip
Upload date: Oct 23, 2018
Size: 30.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.0 setuptools/38.5.0 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.4.3

File hashes

Hashes for scrapy_user_agents-0.1.1.win-amd64.zip
Algorithm	Hash digest
SHA256	`aa1f78c8cbae42f1a7159c5ea16c2638ac17e78d7d44111d164ed099ec48705f`
MD5	`90ceaf139d9d9bad8a082413f5696e6f`
BLAKE2b-256	`8918dcf232312662f4242439691142ef58b90c59eb8bb196b9cc86fcbd8c6c08`

File details

Details for the file scrapy_user_agents-0.1.1-py2.py3-none-any.whl.

File metadata

Download URL: scrapy_user_agents-0.1.1-py2.py3-none-any.whl
Upload date: Oct 23, 2018
Size: 27.9 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.20.0 setuptools/38.5.0 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.4.3

File hashes

Hashes for scrapy_user_agents-0.1.1-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`284c9af555f3128697a2953ab3cdb987b160b091a12896562d969cf9e81d1350`
MD5	`5c34d14eb5955e76ea21c42d781c8a30`
BLAKE2b-256	`501f58a58f465f6d3c75b6cca0e470613349504b8c69f3f3963c898ebabdfa21`
