A daemon for scheduling Scrapy spiders

Scrapy Do is a daemon that provides a convenient way to run Scrapy spiders. It can run them once, immediately, or periodically at specified time intervals. It was inspired by scrapyd but is written from scratch. It comes with a REST API, a command-line client, and an interactive web interface.

Quick Start

  • Install scrapy-do using pip:

    $ pip install scrapy-do
  • Start the daemon in the foreground:

    $ scrapy-do -n scrapy-do
  • Open another terminal window, download Scrapy’s Quotesbot example, and push the code to the server:

    $ git clone
    $ cd quotesbot
    $ scrapy-do-cl push-project
    | quotesbot      |
    | toscrape-css   |
    | toscrape-xpath |
  • Schedule some jobs:

    $ scrapy-do-cl schedule-job --project quotesbot \
        --spider toscrape-css --when 'every 5 to 15 minutes'
    | identifier                           |
    | 0a3db618-d8e1-48dc-a557-4e8d705d599c |
    $ scrapy-do-cl schedule-job --project quotesbot --spider toscrape-css
    | identifier                           |
    | b3a61347-92ef-4095-bb68-0702270a52b8 |
  • See what’s going on:

    [Figure: Active Jobs]

    The web interface is available at http://localhost:7654 by default.
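
The scheduling step above can also be driven from a script. The following is a minimal sketch, not part of Scrapy Do itself: it only assembles the `scrapy-do-cl schedule-job` command line from the flags shown in the Quick Start (`--project`, `--spider`, `--when`). The helper names `build_schedule_command` and `schedule_job` are hypothetical.

```python
import shlex
import subprocess
from typing import List, Optional


def build_schedule_command(project: str, spider: str,
                           when: Optional[str] = None) -> List[str]:
    """Build the argv for `scrapy-do-cl schedule-job`.

    Only the flags shown in the Quick Start are used. Leaving `when`
    unset mirrors the second Quick Start command, which passes no --when.
    """
    argv = ["scrapy-do-cl", "schedule-job",
            "--project", project, "--spider", spider]
    if when is not None:
        argv += ["--when", when]
    return argv


def schedule_job(project: str, spider: str,
                 when: Optional[str] = None) -> str:
    """Run the client and return its raw output (hypothetical helper).

    Requires scrapy-do-cl on PATH and a running daemon.
    """
    result = subprocess.run(build_schedule_command(project, spider, when),
                            check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Print the commands instead of running them, so no daemon is needed.
    for when in ("every 5 to 15 minutes", None):
        argv = build_schedule_command("quotesbot", "toscrape-css", when)
        print(shlex.join(argv))
```

Passing the arguments as a list avoids shell quoting problems with schedule specs that contain spaces, such as 'every 5 to 15 minutes'.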

Building from source

Both steps below require Node.js to be installed.

  • Check if things work fine:

    $ pip install -r requirements-dev.txt
    $ tox
  • Build the wheel:

    $ python setup.py bdist_wheel
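
Since both build steps depend on Node.js, it can help to check for it before starting. The sketch below is one possible pre-flight check plus a dry-run listing of the build commands. It rests on two assumptions not stated above: that the Node.js executable is named `node` or `nodejs`, and that the repository has a conventional `setup.py` at its root. The function names are hypothetical.

```python
import shlex
import shutil
import sys
from typing import Callable, List, Optional


def find_node(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of a Node.js executable, or None if not found.

    The binary names `node` and `nodejs` are assumptions.
    """
    for name in ("node", "nodejs"):
        path = which(name)
        if path:
            return path
    return None


def build_steps(python: str = sys.executable) -> List[List[str]]:
    """List the build commands in the order they should run.

    Assumes requirements-dev.txt and setup.py sit in the current directory.
    """
    return [
        ["pip", "install", "-r", "requirements-dev.txt"],
        ["tox"],
        [python, "setup.py", "bdist_wheel"],
    ]


if __name__ == "__main__":
    if find_node() is None:
        sys.exit("Node.js not found on PATH; install it before building.")
    # Dry run: only print the commands rather than executing them.
    for step in build_steps():
        print(shlex.join(step))
```

To execute the steps for real, each list can be passed to `subprocess.run(step, check=True)` from the repository root.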

Download files

Files for scrapy-do, version 0.3.2
| Filename                         | Size     | File type | Python version |
| scrapy_do-0.3.2-py3-none-any.whl | 797.0 kB | Wheel     | py3            |