spider-brew-kit

A collection of Scrapy utilities, including commonly used pipelines and middlewares.

install

pip install spider-brew-kit

usage

pipelines

mongo pipeline

A pipeline that saves items to MongoDB asynchronously using txmongo.

First create a MongoDB user with readWrite access to the target database (mongo shell):

use database
db.createUser({ user: "username", pwd: "password", roles: [{ role: "readWrite", db: "database" }] })

How to use it:

  1. Add the pipeline to settings.py:
ITEM_PIPELINES = {
    'scrapy_kit.pipelines.MongoPipeline': 300,
}
  2. Add the MongoDB configuration to settings.py:
MONGO_URI = "mongodb://username:password@host:port"
MONGO_DATABASE_NAME = "database"
MONGO_COLLECTION_NAME = "collection"
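
For orientation, here is a minimal sketch of how a pipeline could consume the three settings above and write items through txmongo. It is not the library's actual implementation: the class name `MongoPipelineSketch` is hypothetical, and the code assumes txmongo's `ConnectionPool` and `Collection.insert_one`, both of which return Twisted Deferreds.

```python
class MongoPipelineSketch:
    """Hypothetical illustration only; not the code shipped in this package."""

    def __init__(self, uri, database, collection):
        self.uri = uri
        self.database = database
        self.collection_name = collection
        self.connection = None
        self.collection = None

    @classmethod
    def from_crawler(cls, crawler):
        # Read the same setting names that this README asks you to define.
        settings = crawler.settings
        return cls(
            uri=settings.get("MONGO_URI"),
            database=settings.get("MONGO_DATABASE_NAME"),
            collection=settings.get("MONGO_COLLECTION_NAME"),
        )

    def open_spider(self, spider):
        # Imported lazily so the class can be inspected without txmongo installed.
        from txmongo.connection import ConnectionPool

        self.connection = ConnectionPool(self.uri)
        self.collection = self.connection[self.database][self.collection_name]

    def close_spider(self, spider):
        # disconnect() returns a Deferred that fires once the pool is closed.
        return self.connection.disconnect()

    def process_item(self, item, spider):
        # insert_one() returns a Deferred, so the insert does not block the reactor.
        deferred = self.collection.insert_one(dict(item))
        # Hand the original item on to the next pipeline once the insert finishes.
        deferred.addCallback(lambda _result: item)
        return deferred
```

Returning the Deferred from `process_item` lets Scrapy wait for the insert without blocking other requests, which is the point of using txmongo rather than a blocking driver.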

middlewares

proxy connection close middleware

A middleware that disables connection reuse for requests sent through a proxy.

When using a dynamic tunnel proxy, the request count shown in the tunnel proxy usage statistics of the provider's personal center can be far lower than the number of requests actually sent, and the exit IP does not change between requests. The cause is that requests sent through the tunnel reuse previously established connections. To fix this, add Connection: close to the request headers.

How to use it:

  1. Add in settings.py:
DOWNLOADER_MIDDLEWARES = {
    'scrapy_kit.middlewares.ProxyConnectionCloseMiddleware': 543,
}
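
The idea behind the middleware can be sketched in a few lines. This is an illustration under assumptions, not the package's source: the class name `ConnectionCloseSketchMiddleware` is hypothetical, and the sketch sets the header on every request without checking whether a proxy is configured.

```python
class ConnectionCloseSketchMiddleware:
    """Hypothetical illustration only; not the code shipped in this package."""

    def process_request(self, request, spider):
        # Ask the server or tunnel to close the connection after this response,
        # so the next request opens a new connection instead of reusing this one.
        request.headers["Connection"] = "close"
        # Returning None tells Scrapy to continue processing the request normally.
        return None
```

Opening a new connection for each request costs some throughput, so this kind of middleware is best enabled only for spiders that rely on per-request IP rotation.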

development

git clone
cd spider-brew-kit
poetry install

Download files

Source Distribution

spider_brew_kit-0.1.6.tar.gz (12.2 kB)

Uploaded Source

Built Distribution

spider_brew_kit-0.1.6-py3-none-any.whl (14.1 kB)

Uploaded Python 3