Asynchronous HTTP API for Running Scrapy Spiders
ScrapyRTA
ScrapyRTA is an asynchronous HTTP API for running Scrapy spiders, built with FastAPI. It's a modern rewrite of the legacy ScrapyRT project, focusing on asynchronous operation and scalability.
Features
- Run Scrapy spiders via an Async HTTP API
- Configurable request parameters
API Endpoints
POST /crawl.json
Run a spider with specified parameters.
Example request:
curl --data '{"request": {"url": "http://quotes.toscrape.com/page/2/"}, "spider_name": "toscrape-css", "crawl_args": {"zipcode":"14000"}}' http://localhost:9080/crawl -v
Have a look at http://127.0.0.1:9080/docs for more details and examples.
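The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_crawl_request` and `run_crawl` are hypothetical and not part of ScrapyRTA. It assumes the server is reachable at the URL used in the curl example, that it accepts a JSON body sent with `Content-Type: application/json` (the curl command above does not set this header explicitly), and that the response body is JSON.

```python
import json
import urllib.request

# Assumption: same host, port and path as the curl example above.
CRAWL_URL = "http://localhost:9080/crawl"


def build_crawl_request(spider_name, url, crawl_args=None, endpoint=CRAWL_URL):
    """Build (but do not send) a POST request with the same JSON body as the curl example."""
    payload = {
        "request": {"url": url},
        "spider_name": spider_name,
        "crawl_args": crawl_args or {},
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the API expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_crawl(request, timeout=30):
    """Send the request and decode the response body, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a running server, `run_crawl(build_crawl_request("toscrape-css", "http://quotes.toscrape.com/page/2/", {"zipcode": "14000"}))` would send the request from the curl example; the shape of the returned data is not described here, so inspect it before relying on specific keys.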
You can also create an .env file with the following content to alter ScrapyRTA behavior:
SCRAPYRTA_DEBUG=False
SCRAPYRTA_LOG_LEVEL=INFO
SCRAPYRTA_ENABLE_OPEN_API=False
SCRAPYRTA_TIMEOUT_LIMIT=30 # seconds
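To illustrate how these values might be interpreted, here is a small sketch that parses the same text into typed settings. It is not ScrapyRTA's own loader: `parse_env_text` and `as_bool` are hypothetical helpers, and whether ScrapyRTA itself tolerates the trailing `# seconds` comment depends on its settings loader, which is not described here.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments.

    Simplification: everything after the first '#' is treated as a comment,
    so values that contain '#' are not supported by this sketch.
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def as_bool(value):
    """Interpret common truthy spellings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


ENV_TEXT = """\
SCRAPYRTA_DEBUG=False
SCRAPYRTA_LOG_LEVEL=INFO
SCRAPYRTA_ENABLE_OPEN_API=False
SCRAPYRTA_TIMEOUT_LIMIT=30 # seconds
"""

settings = parse_env_text(ENV_TEXT)
debug = as_bool(settings["SCRAPYRTA_DEBUG"])
timeout_seconds = int(settings["SCRAPYRTA_TIMEOUT_LIMIT"])
```

If the inline comment causes trouble with the real loader, moving `# seconds` onto its own line is the safer form.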
Notes
- Requires scrapy.cfg in the project directory; raises an error if it is missing.
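A pre-flight check for this requirement could look like the sketch below. The function `require_scrapy_cfg` is a hypothetical helper for illustration, not something ScrapyRTA exposes, and the exception type ScrapyRTA actually raises is not stated here.

```python
from pathlib import Path


def require_scrapy_cfg(project_dir="."):
    """Return the path to scrapy.cfg in project_dir, or raise if it is absent."""
    cfg_path = Path(project_dir) / "scrapy.cfg"
    if not cfg_path.is_file():
        # Assumption: FileNotFoundError is used here only for illustration.
        raise FileNotFoundError(
            f"scrapy.cfg not found in {Path(project_dir).resolve()}"
        )
    return cfg_path
```

Calling `require_scrapy_cfg()` from the directory where the server will be started gives an early, explicit failure instead of one at startup.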
File details
Details for the file scrapyrta-1.0.3.tar.gz.
File metadata
- Download URL: scrapyrta-1.0.3.tar.gz
- Upload date:
- Size: 101.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bc18bc11054e4772eb2cbd9de3ba04231d9829d13463fab36f87fc84051ce8d8 |
| MD5 | c89b3771454a2892f7969773abc87d5c |
| BLAKE2b-256 | 452d1797edc7dfdec1b14bcfc8fb6b7163bf3d852c16e4a41b3a570783cc86ab |
File details
Details for the file scrapyrta-1.0.3-py3-none-any.whl.
File metadata
- Download URL: scrapyrta-1.0.3-py3-none-any.whl
- Upload date:
- Size: 12.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1c73f6bb48508f9ea299e3a74fc629be5dccda3d54f00184ca7348ce44c45722 |
| MD5 | 6b7b4f991331a324b9b701e5f9ce2615 |
| BLAKE2b-256 | 7769406f5b4e85b7f36f664c2ecfec1b3e08a86fed7eca940b2360f859f46ac4 |