Asynchronous HTTP API for Running Scrapy Spiders

ScrapyRTA

ScrapyRTA is an asynchronous HTTP API for running Scrapy spiders, built with FastAPI. It's a modern rewrite of the legacy ScrapyRT project, focusing on asynchronous operation and scalability.

Features

  • Run Scrapy spiders via an asynchronous HTTP API
  • Configurable request parameters

API Endpoints

POST /crawl.json

Run a spider with specified parameters.

Example request:

curl -v -H 'Content-Type: application/json' --data '{"request": {"url": "http://quotes.toscrape.com/page/2/"}, "spider_name": "toscrape-css", "crawl_args": {"zipcode":"14000"}}' http://localhost:9080/crawl

Have a look at http://127.0.0.1:9080/docs for more details and examples.
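
For callers who prefer Python over curl, the same request can be sketched with the standard library. This is a minimal sketch, not part of ScrapyRTA: the helper names (build_crawl_payload, build_crawl_request, run_crawl) are hypothetical, the endpoint URL is copied from the curl example (the heading above lists /crawl.json, so check /docs for the path your version serves), and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request


def build_crawl_payload(url, spider_name, crawl_args=None):
    """Build the JSON body used in the curl example (hypothetical helper)."""
    payload = {"request": {"url": url}, "spider_name": spider_name}
    if crawl_args:
        payload["crawl_args"] = dict(crawl_args)
    return payload


def build_crawl_request(endpoint, payload):
    """Wrap the payload in a POST request with a JSON content type."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_crawl(endpoint, payload, timeout=30):
    """Send the request; assumes (unverified) that the response is JSON."""
    request = build_crawl_request(endpoint, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with a ScrapyRTA server already running locally:
# payload = build_crawl_payload(
#     "http://quotes.toscrape.com/page/2/",
#     "toscrape-css",
#     {"zipcode": "14000"},
# )
# print(run_crawl("http://localhost:9080/crawl", payload))
```

Building the request is kept separate from sending it so the payload can be inspected or logged before any network call is made.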

You can also create a .env file with the following content to alter ScrapyRTA's behavior:

SCRAPYRTA_DEBUG=False
SCRAPYRTA_LOG_LEVEL=INFO
SCRAPYRTA_ENABLE_OPEN_API=False

SCRAPYRTA_TIMEOUT_LIMIT=30 # seconds
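
To make the meaning of these values concrete, here is a rough sketch of how such a file could be read into typed settings. It is an illustration only and does not show how ScrapyRTA itself loads its configuration: parse_env and as_bool are hypothetical helpers, and the types (booleans for the two flags, an integer number of seconds for the timeout) are inferred from the sample values above.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments.

    Deliberately naive: a '#' inside a value would be treated as a comment.
    """
    settings = {}
    for raw_line in text.splitlines():
        # Drop anything after '#', then trim surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def as_bool(value):
    """Interpret common truthy spellings; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


ENV_TEXT = """\
SCRAPYRTA_DEBUG=False
SCRAPYRTA_LOG_LEVEL=INFO
SCRAPYRTA_ENABLE_OPEN_API=False

SCRAPYRTA_TIMEOUT_LIMIT=30 # seconds
"""

settings = parse_env(ENV_TEXT)
debug = as_bool(settings["SCRAPYRTA_DEBUG"])
open_api_enabled = as_bool(settings["SCRAPYRTA_ENABLE_OPEN_API"])
timeout_seconds = int(settings["SCRAPYRTA_TIMEOUT_LIMIT"])
```

Note that the inline "# seconds" comment is stripped before the timeout is converted to an integer.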

Notes

  • Requires a scrapy.cfg file in the project directory; ScrapyRTA raises an error if it is missing.
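
The scrapy.cfg requirement can be checked up front before starting the server. The sketch below is a hypothetical pre-flight check rather than ScrapyRTA's own logic: the function name require_scrapy_cfg and the choice of FileNotFoundError are assumptions, and the error ScrapyRTA actually raises may differ.

```python
from pathlib import Path


def require_scrapy_cfg(project_dir="."):
    """Return the path to scrapy.cfg, or raise if it is not in project_dir."""
    cfg_path = Path(project_dir) / "scrapy.cfg"
    if not cfg_path.is_file():
        # Exception type chosen for illustration only.
        raise FileNotFoundError(
            f"scrapy.cfg not found in {cfg_path.parent.resolve()}"
        )
    return cfg_path
```

Calling require_scrapy_cfg() from the directory where you intend to launch ScrapyRTA gives an early, explicit failure instead of one at startup.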
