Asynchronous HTTP API for Running Scrapy Spiders

ScrapyRTA

ScrapyRTA is an asynchronous HTTP API for running Scrapy spiders, built with FastAPI. It's a modern rewrite of the legacy ScrapyRT project, focusing on asynchronous operation and scalability.

Features

  • Run Scrapy spiders through an asynchronous HTTP API
  • Configurable request parameters

API Endpoints

POST /crawl.json

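Run a spider with the specified parameters. The JSON body names the spider (spider_name), the request to issue (request, including its url), and optional spider arguments (crawl_args).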

Example request:

curl --data '{"request": {"url": "http://quotes.toscrape.com/page/2/"}, "spider_name": "toscrape-css", "crawl_args": {"zipcode":"14000"}}' http://localhost:9080/crawl.json -v
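
The same request can be sent from Python. The following is a minimal sketch using only the standard library, not part of ScrapyRTA itself. The helper names build_crawl_request and run_crawl are hypothetical, the host and port are taken from the example above, and the endpoint path is left as a parameter (defaulting to the documented /crawl.json) in case your deployment differs. Sending a JSON Content-Type header is an assumption about what the server expects.

```python
import json
import urllib.request

# Assumed base URL: host and port are taken from the curl example.
BASE_URL = "http://localhost:9080"


def build_crawl_request(spider_name, url, crawl_args=None, endpoint="/crawl.json"):
    """Build (but do not send) a POST request for the crawl endpoint."""
    payload = {"request": {"url": url}, "spider_name": spider_name}
    if crawl_args:
        # Optional spider arguments, e.g. {"zipcode": "14000"}.
        payload["crawl_args"] = crawl_args
    return urllib.request.Request(
        BASE_URL + endpoint,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the API expects a JSON request body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_crawl(spider_name, url, crawl_args=None, timeout=60):
    """Send the request and return the decoded JSON response body."""
    req = build_crawl_request(spider_name, url, crawl_args)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

With a ScrapyRTA server running, run_crawl("toscrape-css", "http://quotes.toscrape.com/page/2/", {"zipcode": "14000"}) issues the same request as the curl command. The shape of the response is not described here, so the sketch returns the parsed JSON as-is.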

Have a look at http://127.0.0.1:9080/docs for more details and examples.

You can also create a .env file with the following content to alter ScrapyRTA's behavior:

SCRAPYRTA_DEBUG=False
SCRAPYRTA_LOG_LEVEL=INFO
SCRAPYRTA_ENABLE_OPEN_API=False

SCRAPYRTA_TIMEOUT_LIMIT=30 # seconds
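
To make the value types concrete, here is a small sketch of how such a file reads as typed settings. How ScrapyRTA itself loads the file is not stated here, so the parse_env and as_bool helpers are hypothetical and purely illustrative. In particular, stripping the inline "# seconds" comment is an assumption of this sketch; other .env loaders may handle inline comments differently.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and stripping '#' comments."""
    settings = {}
    for line in text.splitlines():
        # Assumption: everything after '#' is a comment.
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def as_bool(value):
    """Interpret common truthy spellings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


ENV_TEXT = """\
SCRAPYRTA_DEBUG=False
SCRAPYRTA_LOG_LEVEL=INFO
SCRAPYRTA_ENABLE_OPEN_API=False

SCRAPYRTA_TIMEOUT_LIMIT=30 # seconds
"""

settings = parse_env(ENV_TEXT)
debug = as_bool(settings["SCRAPYRTA_DEBUG"])              # boolean flag
log_level = settings["SCRAPYRTA_LOG_LEVEL"]               # log level name
open_api = as_bool(settings["SCRAPYRTA_ENABLE_OPEN_API"])  # boolean flag
timeout_limit = int(settings["SCRAPYRTA_TIMEOUT_LIMIT"])  # seconds
```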

Notes

  • ScrapyRTA requires a scrapy.cfg file in the project directory and raises an error if it is missing.
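
A deployment script can check for the file before starting the server. This is a minimal sketch: the find_scrapy_cfg helper is hypothetical, and FileNotFoundError is this sketch's choice, since the exact error ScrapyRTA raises is not specified here.

```python
from pathlib import Path


def find_scrapy_cfg(project_dir="."):
    """Return the path to scrapy.cfg in project_dir, or raise if it is absent."""
    cfg = Path(project_dir) / "scrapy.cfg"
    if not cfg.is_file():
        raise FileNotFoundError(f"scrapy.cfg not found in {cfg.parent.resolve()}")
    return cfg
```

Calling find_scrapy_cfg() from the directory where you intend to start ScrapyRTA surfaces a missing file early, before any crawl request is made.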
