19 projects
Scrapy
A high-level Web Crawling and Web Scraping framework
shub
Scrapinghub Command Line Client
scrapinghub
Client interface for Scrapinghub API
dateparser
Date parsing library designed to parse dates from HTML pages
w3lib
Library of web-related functions
queuelib
Collection of persistent (disk-based) and non-persistent (memory-based) queues
scrapyd
A service for running Scrapy spiders, with an HTTP API
scrapy-crawlera
Crawlera middleware for Scrapy
splash
A JavaScript rendering service with an HTTP API
scrapely
A pure-Python HTML screen-scraping library
slybot
Slybot crawler
webstruct
A library for creating statistical NER systems that work on HTML data
hubstorage
Client interface for Scrapinghub HubStorage
scrapylib
Scrapy helper functions and processors
adblockparser
Parser for Adblock Plus rules
scrapy-dotpersistence
Scrapy extension to sync `.scrapy` folder to an S3 bucket
scrapyjs
JavaScript support for Scrapy using Splash
flatson
Tool to flatten a stream of JSON-like objects, configured via a schema
crawl-frontier
A flexible frontier for web crawlers