A Scrapy middleware that rotates proxies across requests.

====================
Scrapy-Rotated-Proxy
====================

.. image:: https://img.shields.io/pypi/v/scrapy-rotated-proxy.svg
   :target: https://pypi.python.org/pypi/scrapy-rotated-proxy
   :alt: PyPI Version

.. image:: https://img.shields.io/travis/xiaowangwindow/scrapy-rotated-proxy/master.svg
   :target: http://travis-ci.org/xiaowangwindow/scrapy-rotated-proxy
   :alt: Build Status

Overview
========

Scrapy-Rotated-Proxy is a Scrapy downloader middleware that dynamically assigns a proxy to each Request.
Use it when you have multiple proxy IPs and need to rotate them across requests.
Scrapy-Rotated-Proxy supports multiple storage backends: you can provide the proxy
list through the spider settings, a local file, or MongoDB.
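
To make the idea of rotation concrete, here is a minimal round-robin sketch. It is not the package's implementation: the ``PROXIES`` dict and the ``next_proxy`` helper are hypothetical names chosen for illustration, and the proxy URLs are the placeholder values used in the settings examples of this README.

```python
from itertools import cycle

# Hypothetical proxy pools keyed by request scheme (illustration only).
PROXIES = {
    'http': ['http://proxy0:8888', 'http://user:pass@proxy1:8888'],
    'https': ['http://proxy0:8888', 'https://user:pass@proxy1:8888'],
}

# One endless round-robin iterator per scheme.
_pools = {scheme: cycle(urls) for scheme, urls in PROXIES.items()}


def next_proxy(url):
    """Return the next proxy for the scheme of ``url``, or None if no pool exists."""
    scheme = url.split('://', 1)[0].lower()
    pool = _pools.get(scheme)
    return next(pool) if pool is not None else None
```

In a downloader middleware, the result could be assigned to ``request.meta['proxy']``, which is the key Scrapy reads for a per-request proxy. Round-robin is only one possible policy; random choice or weighting by proxy health would fit the same interface.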

Requirements
============

* Python 2.7 or Python 3.3+
* Works on Linux, Windows, macOS, BSD

Install
=======

The quick way::

    pip install scrapy-rotated-proxy

Alternatively, copy this middleware into your Scrapy project.

Documentation
=============

Configure one of the following backends in settings.py, for example::

    # -----------------------------------------------------------------------------
    # ROTATED PROXY SETTINGS (Spider Settings Backend)
    # -----------------------------------------------------------------------------
    # Note: .update() assumes DOWNLOADER_MIDDLEWARES is already defined earlier in
    # settings.py; otherwise assign the dict directly.
    DOWNLOADER_MIDDLEWARES.update({
        'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': None,
        'scrapy_rotated_proxy.downloadmiddlwares.proxy.RotatedProxy': 750,
    })
    ROTATED_PROXY_ENABLED = True
    PROXY_STORAGE = 'scrapy_rotated_proxy.extensions.file_storage.FileProxyStorage'
    # When PROXY_FILE_PATH is '', scrapy-rotated-proxy falls back to the
    # proxies defined in the spider settings below.
    PROXY_FILE_PATH = ''
    HTTP_PROXIES = [
        'http://proxy0:8888',
        'http://user:pass@proxy1:8888',
        'https://user:pass@proxy1:8888',
    ]
    HTTPS_PROXIES = [
        'http://proxy0:8888',
        'http://user:pass@proxy1:8888',
        'https://user:pass@proxy1:8888',
    ]

    # -----------------------------------------------------------------------------
    # ROTATED PROXY SETTINGS (Local File Backend)
    # -----------------------------------------------------------------------------
    DOWNLOADER_MIDDLEWARES.update({
        'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': None,
        'scrapy_rotated_proxy.downloadmiddlwares.proxy.RotatedProxy': 750,
    })
    ROTATED_PROXY_ENABLED = True
    PROXY_STORAGE = 'scrapy_rotated_proxy.extensions.file_storage.FileProxyStorage'
    PROXY_FILE_PATH = 'file_path/proxy.txt'
    # Proxy file content. It must conform to JSON format; otherwise loading it
    # fails with a JSON load error.
    HTTP_PROXIES = [
        'http://proxy0:8888',
        'http://user:pass@proxy1:8888',
        'https://user:pass@proxy1:8888'
    ]
    HTTPS_PROXIES = [
        'http://proxy0:8888',
        'http://user:pass@proxy1:8888',
        'https://user:pass@proxy1:8888'
    ]

    # -----------------------------------------------------------------------------
    # ROTATED PROXY SETTINGS (MongoDB Backend)
    # -----------------------------------------------------------------------------
    # Required fields of each MongoDB document: scheme, ip, port, username, password
    # Example document: {'scheme': 'http', 'ip': '10.0.0.1', 'port': 8080,
    #                    'username': 'user', 'password': 'password'}
    DOWNLOADER_MIDDLEWARES.update({
        'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': None,
        'scrapy_rotated_proxy.downloadmiddlwares.proxy.RotatedProxy': 750,
    })
    ROTATED_PROXY_ENABLED = True
    PROXY_STORAGE = 'scrapy_rotated_proxy.extensions.mongodb_storage.MongoDBProxyStorage'
    PROXY_MONGODB_STORAGE_URI = 'mongodb://10.255.0.0:27017'
    PROXY_MONGODB_STORAGE_DB = 'vps_management'
    PROXY_MONGODB_STORAGE_COLL = 'service'
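
The MongoDB backend stores each proxy as separate fields (scheme, ip, port, username, password), while the other backends take full proxy URLs. The sketch below shows one way such a document could map to a URL of the same shape. It assumes Python 3, and ``proxy_url_from_document`` is a hypothetical helper, not a function exported by this package; how the package itself assembles the URL is not documented here.

```python
from urllib.parse import quote


def proxy_url_from_document(doc):
    """Build 'scheme://[user:pass@]ip:port' from a proxy document (illustrative)."""
    auth = ''
    if doc.get('username'):
        # Percent-encode credentials so characters such as '@' or ':'
        # cannot be confused with URL delimiters.
        auth = '{}:{}@'.format(quote(doc['username'], safe=''),
                               quote(doc.get('password') or '', safe=''))
    return '{}://{}{}:{}'.format(doc['scheme'], auth, doc['ip'], doc['port'])


# The example document from the MongoDB settings comment above.
example = {'scheme': 'http', 'ip': '10.0.0.1', 'port': 8080,
           'username': 'user', 'password': 'password'}
print(proxy_url_from_document(example))  # http://user:password@10.0.0.1:8080
```

Documents without a username produce a URL with no credentials part, for example ``https://10.0.0.2:3128``.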