
Generic classes to deal with data scraping using Scrapy

Project description

Scrapy is a fantastic tool for data scraping. However, for someone who doesn’t work with the framework frequently, it can be hard to learn how to build patterns that are common in scraping tasks, such as “login”, “search”, and “pagination”. It is also hard to find features like database pipelines, advanced middlewares, and commands to run scripts.

Scrapy-venom aims to fill this gap. It brings a new convention for implementing spiders: a more “DRY” (Don’t Repeat Yourself), feature-based way to program.

Venom classes are intended to solve simple scraping problems, one at a time. The library comes with a series of mixins, each providing one feature, which we call “steps”. A set of “steps” builds the spider “flow”. This makes scraping code easier to read and understand.
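To make the idea of “steps” composing a “flow” more concrete, here is a minimal sketch in plain Python. It does not use Scrapy or the real Scrapy-venom API: the class names (`LoginStep`, `SearchStep`, `PaginationStep`, `ExampleFlow`), the `flow` attribute, and the `run` method are all hypothetical and exist only to illustrate the mixin pattern described above.

```python
# Illustrative only: these names are assumptions, not Scrapy-venom's API.


class LoginStep:
    """Hypothetical step: pretend to authenticate and store a session."""

    def login(self, state):
        state["session"] = "token-for-" + state["user"]
        return state


class SearchStep:
    """Hypothetical step: pretend to run a search and collect results."""

    def search(self, state):
        state["results"] = [f"{state['query']}-{i}" for i in range(3)]
        return state


class PaginationStep:
    """Hypothetical step: split the results into fixed-size pages."""

    page_size = 2

    def paginate(self, state):
        results = state["results"]
        state["pages"] = [
            results[i:i + self.page_size]
            for i in range(0, len(results), self.page_size)
        ]
        return state


class ExampleFlow(LoginStep, SearchStep, PaginationStep):
    """A 'spider' assembled from step mixins; `flow` names the step order."""

    flow = ("login", "search", "paginate")

    def run(self, state):
        # Call each step in order, passing the shared state along.
        for step_name in self.flow:
            state = getattr(self, step_name)(state)
        return state


result = ExampleFlow().run({"user": "alice", "query": "book"})
print(result["pages"])
```

In this sketch each step only knows about its own feature, and the flow class simply declares which steps to combine and in what order. That separation is the readability benefit the description refers to; the actual library may organise its steps differently.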

The documentation is available at:

