Username: scrapy

29 projects

scrapy-zyte-smartproxy

Scrapy middleware for Zyte Smart Proxy Manager

queuelib

Collection of persistent (disk-based) and non-persistent (memory-based) queues

Scrapy

A high-level Web Crawling and Web Scraping framework

scrapy-poet

Page Object pattern for Scrapy

andi

Library for annotation-based dependency injection

itemloaders

Base library for Scrapy's ItemLoader

itemadapter

Common interface for data container classes

web-poet

Scrapinghub's Page Object pattern for web scraping

splash

A JavaScript renderer with an HTTP API

w3lib

Library of web-related functions

parsel

Parsel is a library to extract data from HTML and XML using XPath and CSS selectors

Protego

Pure-Python robots.txt parser with support for modern conventions

scrapely

A pure-Python HTML screen-scraping library

scrapy-po

Page Object pattern for Scrapy

cssselect

cssselect parses CSS3 Selectors and translates them to XPath 1.0

scrapyd

A service for running Scrapy spiders, with an HTTP API

webstruct

A library for creating statistical NER systems that work on HTML data

scrapyd-client

A client for scrapyd

PyPyDispatcher

Multi-producer-multi-consumer signal dispatching mechanism

scrapy-deltafetch

Scrapy middleware to ignore previously crawled pages

adblockparser

Parser for Adblock Plus rules

loginform

Fill HTML login forms automatically

scrapy-splitvariants

Scrapy spider middleware to split an item into multiple items on a multi-valued key

scrapy-hcf

Scrapy spider middleware to use Scrapinghub's Hub Crawl Frontier as a backend for URLs

scrapy-querycleaner

Scrapy spider middleware to clean up query parameters in request URLs

scrapy-magicfields

Scrapy middleware to add extra "magic" fields to items

scrapy-djangoitem

Scrapy extension to write scraped items using Django models

scrapyjs

JavaScript support for Scrapy using Splash

scrapy-jsonrpc

Scrapy extension to control spiders using JSON-RPC
