Username    pablohoffman

18 projects

w3lib

Library of web-related functions

parsel

Parsel is a library to extract data from HTML and XML using XPath and CSS selectors

Scrapy

A high-level Web Crawling and Web Scraping framework

scrapy-crawlera

Crawlera middleware for Scrapy

dateparser

Date parsing library designed to parse dates from HTML pages

splash

A JavaScript rendering service with an HTTP API

scrapely

A pure-Python HTML screen-scraping library

shub

Scrapinghub Command Line Client

slybot

Slybot crawler

scrapyd

A service for running Scrapy spiders, with an HTTP API

frontera

A scalable frontier for web crawlers

queuelib

Collection of persistent (disk-based) queues

webstruct

A library for creating statistical NER systems that work on HTML data

hubstorage

Client interface for Scrapinghub HubStorage

scrapylib

Scrapy helper functions and processors

adblockparser

Parser for Adblock Plus rules

scrapy-dotpersistence

Scrapy extension to sync `.scrapy` folder to an S3 bucket

scrapyjs

JavaScript support for Scrapy using Splash
