Statsd integration middleware for scrapy

Usage

pip install scrapy-statsd-middleware

DOWNLOADER_MIDDLEWARES = {
  'statsd_middleware.StatsdMiddleware': 543,
}

SPIDER_MIDDLEWARES = {
  'statsd_middleware.StatsdMiddleware': 543,
}

There are also a few settings that you can use:

  • STATSD_HOSTNAME - Defaults to the current machine’s hostname

  • STATSD_PREFIX - Defaults to “hostname.spider-name.”

  • STATSD_HOST_IP - Defaults to “0.0.0.0”
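As a rough sketch, these settings could be overridden in a project's settings.py as shown below. Only the three setting names come from this README. The values, the EXAMPLE_SPIDER_NAME placeholder, and the use of socket.gethostname() to mirror the hostname default are assumptions made for illustration, not confirmed behaviour of the middleware.

```python
# settings.py (illustrative values; only the setting names come from the README)
import socket

# Hypothetical spider name, used here only to build the example prefix.
EXAMPLE_SPIDER_NAME = "dmoz"

# Assumed to mirror the documented default: the current machine's hostname.
STATSD_HOSTNAME = socket.gethostname()

# Documented default shape: "hostname.spider-name." (note the trailing dot).
STATSD_PREFIX = "{}.{}.".format(STATSD_HOSTNAME, EXAMPLE_SPIDER_NAME)

# "0.0.0.0" is the documented default; presumably this is the statsd server address.
STATSD_HOST_IP = "0.0.0.0"
```

Setting STATSD_PREFIX explicitly is likely only needed when the default hostname-based prefix is not what you want in your metric paths.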

This will increment the following statsd counters:

  • requests (spider_reqs_issued)

  • responses (spider_resps_received)

  • errors (error_KeyError, where KeyError is whatever the error name is)

  • items processed (processed_Product, where Product is whatever the item class name is)
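The naming scheme above can be sketched with the standard library alone. This is not the middleware's actual code: the helper names, the Product class, and the "myhost.dmoz." prefix are hypothetical. The only external facts relied on are the StatsD counter wire format ("name:value|c") and its conventional UDP port 8125.

```python
import socket


def error_metric(exc):
    """Name for an error counter, e.g. error_KeyError."""
    return "error_" + type(exc).__name__


def item_metric(item):
    """Name for a processed-item counter, e.g. processed_Product."""
    return "processed_" + type(item).__name__


def counter_line(prefix, name, value=1):
    """Format a StatsD counter increment: '<prefix><name>:<value>|c'."""
    return "{}{}:{}|c".format(prefix, name, value)


def send_counter(line, host="127.0.0.1", port=8125):
    """Send one counter line to a StatsD server over UDP (fire and forget)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()


class Product(object):
    """Hypothetical item class, used only for illustration."""


if __name__ == "__main__":
    prefix = "myhost.dmoz."  # assumed prefix in the "hostname.spider-name." shape
    print(counter_line(prefix, "spider_reqs_issued"))
    print(counter_line(prefix, error_metric(KeyError("missing"))))
    print(counter_line(prefix, item_metric(Product())))
```

Because the error and item counters are named after the class, every distinct exception type or item class gets its own counter.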

Example Implementation

An example implementation of this middleware is in /example. It includes a docker-compose file that shows how to use this middleware with statsd and graphite.

Example Installation & Usage

  • Build the docker images: docker-compose build

  • Start the statsd container: docker-compose up -d

  • Run the example spider: docker-compose -f ./example/docker-compose.yml run spider bash -c "cd ./opt/scrapy/dirbot/ && scrapy crawl dmoz"

You can see a live graphite dashboard at http://0.0.0.0/dashboard. Stats should show up under a path like “stats.Z-MacBook-Pro.local.dmoz.spider_reqs_issued”.
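To predict where a counter will appear in the Graphite tree, the path can be assembled as in the sketch below. The graphite_path helper is hypothetical, and the "stats" root is taken from the example path above rather than from any confirmed configuration.

```python
def graphite_path(statsd_prefix, metric, graphite_root="stats"):
    """Join the Graphite root, the STATSD_PREFIX and the metric name."""
    return "{}.{}{}".format(graphite_root, statsd_prefix, metric)


path = graphite_path("Z-MacBook-Pro.local.dmoz.", "spider_reqs_issued")
print(path)

# Graphite treats every dot as a folder separator, so a hostname that
# contains a dot ("Z-MacBook-Pro.local") spans two levels of the tree.
print(path.split("."))
```

If the extra nesting is unwanted, setting STATSD_HOSTNAME or STATSD_PREFIX to a value without dots should avoid it.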

Development

You can run the tests via make test
