scrapy-fake-useragent-fix
=========================
Random User-Agent middleware based on
`fake-useragent <https://pypi.python.org/pypi/fake-useragent>`__. It
picks up ``User-Agent`` strings based on `usage
statistics <http://www.w3schools.com/browsers/browsers_stats.asp>`__
from a `real world database <http://useragentstring.com/>`__.
Installation
-------------
The simplest way is to install it via ``pip``::

    pip install scrapy-fake-useragent-fix
Configuration
-------------
Turn off the built-in ``UserAgentMiddleware`` and add
``RandomUserAgentMiddleware``.
In Scrapy >=1.0:
.. code:: python

    DOWNLOADER_MIDDLEWARES = {
        'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
        'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,
    }
In Scrapy <1.0:
.. code:: python

    DOWNLOADER_MIDDLEWARES = {
        'scrapy.contrib.downloadermiddleware.useragent.UserAgentMiddleware': None,
        'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,
    }
Configuring User-Agent type
---------------------------
There's a configuration parameter ``RANDOM_UA_TYPE``, defaulting to ``random``, which is passed verbatim to fake-useragent to select User-Agent strings. Supported values are ``random``, ``chrome``, ``firefox``, ``safari``, and ``internetexplorer``. To restrict the choice to a specific device type, prefix the browser type with a device name, for example ``desktop.chrome`` or ``mobile.chrome``; only the ``desktop``, ``mobile``, and ``tablet`` prefixes are supported.
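For example, to restrict the middleware to desktop Chrome User-Agent strings, the setting in your project's ``settings.py`` would be (the value shown is just one of the supported combinations):

.. code:: python

    # settings.py -- pick only desktop Chrome User-Agent strings
    RANDOM_UA_TYPE = 'desktop.chrome'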
Usage with `scrapy-proxies`
---------------------------
To use with middlewares of random proxy such as `scrapy-proxies <https://github.com/aivarsk/scrapy-proxies>`_, you need:
1. Set ``RANDOM_UA_PER_PROXY`` to ``True`` to enable switching the User-Agent per proxy.
2. Set the priority of ``RandomUserAgentMiddleware`` to be greater than that of the ``scrapy-proxies`` middleware, so that the proxy is attached to the request before the User-Agent is chosen.
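The two steps above can be sketched in ``settings.py`` as follows. The priority numbers here are illustrative assumptions; what matters is only that ``RandomUserAgentMiddleware`` has the higher value than the proxy middleware:

.. code:: python

    # settings.py -- illustrative sketch; exact priority numbers are
    # assumptions, only their relative order matters.
    DOWNLOADER_MIDDLEWARES = {
        'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
        # scrapy-proxies runs first (lower number) and sets the proxy...
        'scrapy_proxies.RandomProxy': 100,
        'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 110,
        # ...then a User-Agent is picked, one per proxy
        'scrapy_fake_useragent.middleware.RandomUserAgentMiddleware': 400,
    }
    RANDOM_UA_PER_PROXY = True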
.. |GitHub version| image:: https://badge.fury.io/gh/alecxe%2Fscrapy-fake-useragent.svg
   :target: http://badge.fury.io/gh/alecxe%2Fscrapy-fake-useragent
.. |Requirements Status| image:: https://requires.io/github/alecxe/scrapy-fake-useragent/requirements.svg?branch=master
   :target: https://requires.io/github/alecxe/scrapy-fake-useragent/requirements/?branch=master
Configuring Fake-UserAgent fallback
-----------------------------------
There's a configuration parameter ``FAKEUSERAGENT_FALLBACK``, defaulting to
``None``. You can set it to a string value, for example ``Mozilla`` or any
other User-Agent string you prefer; when fake-useragent fails to retrieve a
User-Agent, the fallback string is used instead of raising an exception.
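A minimal sketch of the fallback setting in ``settings.py``:

.. code:: python

    # settings.py -- use a fixed string whenever fake-useragent
    # cannot fetch or look up a User-Agent
    FAKEUSERAGENT_FALLBACK = 'Mozilla'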