Scrapy pipeline & extensions for AP Cloudy (logs, stats, requests, items)

Project description

Apcloudy-Pipeline

apcloudy-pipeline is a Scrapy integration that sends items, requests, logs, and spider statistics to the AP Cloudy backend using HMAC-based authentication.


Features

  • 📦 Item forwarding using Scrapy Item Pipeline
  • 🌐 Request and response logging
  • 📊 Spider statistics reporting
  • 🧾 Spider, user, and Scrapy internal log forwarding
  • 🔐 HMAC-secured API communication
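
The package description does not document the signing scheme, so the following is only a generic sketch of how HMAC request signing typically works, not the package's actual protocol. The function names, the `timestamp.body` message layout, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import hmac
import json


def sign_payload(secret_key: str, payload: dict, timestamp: int) -> str:
    """Return a hex HMAC-SHA256 signature over a timestamped JSON body.

    Hypothetical scheme: the real apcloudy-pipeline message format is not
    documented in this description.
    """
    # Serialize deterministically so client and server sign identical bytes.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    message = f"{timestamp}.{body}".encode("utf-8")
    return hmac.new(secret_key.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_signature(secret_key: str, payload: dict, timestamp: int, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_payload(secret_key, payload, timestamp)
    return hmac.compare_digest(expected, signature)
```

In a scheme like this the secret key never leaves the client; only the signature travels with the request, and the server recomputes it from its own copy of the secret.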

Installation

pip install apcloudy-pipeline

Configuration

Add the following settings to your Scrapy project's settings.py file:

APCLOUDY_API_URL = "http://localhost:8000/api/v1/"
APCLOUDY_API_KEY = "api_test_1234567890"
APCLOUDY_SECRET_KEY = "secret_test_1234567890"
JOB_ID = 123
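
The values above are test placeholders. One possible way to keep real credentials out of version control is to read them from environment variables in settings.py. This is a sketch only: the environment variable names are assumptions chosen to mirror the setting names, and the defaults are the sample values shown above.

```python
import os

# Hypothetical environment variable names; any naming scheme would work.
APCLOUDY_API_URL = os.environ.get("APCLOUDY_API_URL", "http://localhost:8000/api/v1/")
APCLOUDY_API_KEY = os.environ.get("APCLOUDY_API_KEY", "")
APCLOUDY_SECRET_KEY = os.environ.get("APCLOUDY_SECRET_KEY", "")

# Environment values are strings, so convert the job id to an integer
# to match the sample configuration.
JOB_ID = int(os.environ.get("JOB_ID", "123"))
```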

Item Pipeline (Required)

The item pipeline is required to send scraped items to the backend.

ITEM_PIPELINES = {
    "apcloudy_pipeline.pipelines.APCloudyItemPipeline": 300,
}

Extensions (Optional)

Enable the following extensions if you want to send requests, logs, and spider statistics.

EXTENSIONS = {
    "apcloudy_pipeline.request_logger.APCloudyRequestLogger": 400,
    "apcloudy_pipeline.extensions.APCloudyLoggingExtension": 510,
    "apcloudy_pipeline.extensions.APCloudyStatsExtension": 520,
}
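
Since the extensions are optional, a project that only needs final statistics could presumably register just that one. This sketch assumes the three extensions work independently of each other, which the description does not explicitly confirm.

```python
# settings.py: enable only the stats extension (assumes it has no
# dependency on the request logger or the logging extension).
EXTENSIONS = {
    "apcloudy_pipeline.extensions.APCloudyStatsExtension": 520,
}
```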

Extensions Overview

  • APCloudyRequestLogger: Captures request and response metadata such as URL, HTTP method, status code, timing, and fingerprint.

  • APCloudyLoggingExtension: Sends spider logs, user logs, Scrapy internal logs, and exception tracebacks to the backend.

  • APCloudyStatsExtension: Sends final spider statistics when the crawl finishes.

Download files

Source Distribution

apcloudy_pipeline-0.1.1.tar.gz (6.7 kB view details)

Uploaded Source

Built Distribution

apcloudy_pipeline-0.1.1-py3-none-any.whl (8.6 kB view details)

Uploaded Python 3

File details

Details for the file apcloudy_pipeline-0.1.1.tar.gz.

File metadata

  • Download URL: apcloudy_pipeline-0.1.1.tar.gz
  • Upload date:
  • Size: 6.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for apcloudy_pipeline-0.1.1.tar.gz
Algorithm Hash digest
SHA256 9589d22789e027313140fef30f92e4ddf84acedec99fb17aa08a4faa16bad990
MD5 fdb59df9d1adf560207aaab34a1b55d7
BLAKE2b-256 1c25a5102ac12720205bb3df53e52c7854849682798c9c4379139988402db331
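
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard library. This is a minimal sketch; the helper names are hypothetical, and the local file path is assumed to be wherever the archive was saved.

```python
import hashlib

# SHA256 digest published for apcloudy_pipeline-0.1.1.tar.gz (from the table above).
EXPECTED_SHA256 = "9589d22789e027313140fef30f92e4ddf84acedec99fb17aa08a4faa16bad990"


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_published_digest("apcloudy_pipeline-0.1.1.tar.gz")` would return True only if the local file is byte-for-byte identical to the published source distribution.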

File details

Details for the file apcloudy_pipeline-0.1.1-py3-none-any.whl.

File hashes

Hashes for apcloudy_pipeline-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 78e8bcb67741191f3258e00a02fee2330bcf86be1293771e5765ca2b939a5c87
MD5 2120d5b45b7516c062c420b8e5d735af
BLAKE2b-256 11fe375b222105c7290762a0d9ad60b7d663479513bb02a1cb7c34e3de870fe9
