9 projects
scrapy-splash
JavaScript support for Scrapy using Splash
scrapy-pagestorage
Scrapy extension to store request and response information in a storage service
scrapy-deltafetch
Scrapy spider middleware to skip requests to pages whose items were scraped in previous crawls
scrapy-jsonschema
Scrapy schema validation pipeline and Item builder using JSON Schema
scrapy-dotpersistence
Scrapy extension to sync `.scrapy` folder to an S3 bucket
scrapy-splitvariants
Scrapy spider middleware to split an item into multiple items on a multi-valued key
scrapy-hcf
Scrapy spider middleware to use Scrapinghub's Hub Crawl Frontier as a backend for URLs
scrapy-querycleaner
Scrapy spider middleware to clean up query parameters in request URLs
scrapy-magicfields
Scrapy middleware to add extra "magic" fields to items