Skip to main content

Real-time metrics exporter for Scrapy with automatic collection of all stats

Project description

Scrapy Metrics Exporter

Export Scrapy metrics in real-time with automatic collection of all Scrapy stats and custom metrics.

Current backend: Prometheus (OpenMetrics format)

Installation

pip install scrapy-metrics-exporter

Quick Start

Add to your Scrapy project's settings.py:

EXTENSIONS = {
    'scrapy_metrics_exporter.extension.MetricsExporter': 500,
}

That's it! Run your spider and metrics will be available at:

http://localhost:9410/metrics

Configuration

Optional settings (with defaults):

# Port for metrics server
METRICS_EXPORTER_PORT = 9410

# Host to bind to
METRICS_EXPORTER_HOST = '0.0.0.0'

# Stats update interval in seconds
METRICS_EXPORTER_UPDATE_INTERVAL = 30

Features

Automatic Metrics Collection

  • Exports all numeric Scrapy stats automatically
  • No manual configuration needed for new metrics
  • Custom stats created in spiders/pipelines are exported automatically

📊 Built-in Real-time Metrics

  • Items scraped/dropped (Counters with spider label)
  • HTTP requests by method (Counters with spider + method labels)
  • HTTP responses by status (Counters with spider + status labels)
  • Pending/active requests (Gauges)

🔧 Flexible & Extensible

  • Works with any Scrapy middleware or pipeline
  • Support for custom metrics via Scrapy stats API
  • Easy to add Histograms/Summaries when needed

Available Metrics

Real-time Counters (via signals)

  • scrapy_items_scraped_total{spider="..."} - Total items scraped
  • scrapy_items_dropped_total{spider="..."} - Total items dropped
  • scrapy_requests_total{spider="...", method="..."} - Total requests by method
  • scrapy_responses_total{spider="...", status="..."} - Total responses by status

State Gauges

  • scrapy_requests_pending{spider="..."} - Current pending requests
  • scrapy_requests_active{spider="..."} - Current active requests

Automatic Scrapy Stats (examples)

  • scrapy_memusage_max{spider="..."} - Maximum memory usage
  • scrapy_downloader_request_bytes{spider="..."} - Total bytes downloaded
  • scrapy_item_scraped_count{spider="..."} - Item count from Scrapy stats
  • And all other numeric Scrapy stats automatically!

Custom Metrics

Any numeric stat you create in your spider or pipeline is automatically exported:

# In your spider or pipeline
self.crawler.stats.inc_value('custom/my_metric')
self.crawler.stats.set_value('custom/avg_price', 42.5)

Appears automatically as:

scrapy_custom_my_metric{spider="..."} 10.0
scrapy_custom_avg_price{spider="..."} 42.5

Query Examples (Prometheus)

# Total items scraped across all spiders
sum(scrapy_items_scraped_total)

# Items scraped by specific spider
scrapy_items_scraped_total{spider="my_spider"}

# Request rate per second
rate(scrapy_requests_total[5m])

# Top spiders by item count
topk(5, scrapy_items_scraped_total)

Future Backends

The exporter is designed to be backend-agnostic. While it currently uses Prometheus (OpenMetrics format), the architecture allows for easy addition of other backends:

  • Prometheus (current) - OpenMetrics format via /metrics endpoint
  • StatsD (planned) - Send metrics to StatsD/Graphite
  • InfluxDB (planned) - Direct InfluxDB integration
  • OpenTelemetry (planned) - OTLP protocol support
  • Custom - Easy to implement your own backend

Architecture

The exporter uses a modular design:

  • Collectors - Gather metrics from Scrapy (signals + stats)
  • Registry - Store and manage metrics
  • Backend - Format and expose metrics (currently Prometheus)
  • Server - HTTP server for metrics endpoint (Twisted)

This separation makes it easy to add new backends without changing the core collection logic.

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

scrapy_metrics_exporter-0.1.0.tar.gz (12.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

scrapy_metrics_exporter-0.1.0-py3-none-any.whl (12.8 kB view details)

Uploaded Python 3

File details

Details for the file scrapy_metrics_exporter-0.1.0.tar.gz.

File metadata

  • Download URL: scrapy_metrics_exporter-0.1.0.tar.gz
  • Upload date:
  • Size: 12.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for scrapy_metrics_exporter-0.1.0.tar.gz
Algorithm Hash digest
SHA256 57db8f65b00e8ed8339d605b4aac4ad727febb007d3011abc0dd2f22706ff00a
MD5 38cac430e72aa6d58a387d7e7714ef04
BLAKE2b-256 207a506d5d702a0301e295e9012e8f9ab68dab1011517379fca294fc3b507c09

See more details on using hashes here.

File details

Details for the file scrapy_metrics_exporter-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for scrapy_metrics_exporter-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 c1c2b74bf2d7991026a940578b4ac3a394abb32f0d6f6d661052b1c52c1b24a3
MD5 83ccdfd5583a47413e9296bc91bb919a
BLAKE2b-256 875ccd83a1c1f31bd6a9193db400b49943dd106d67c7d14e8dcc2ff83ed07c9a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page