Utilities for running scrapy on heroku

Project Description

A package to assist with running Scrapy on Heroku. It provides a custom application configuration at scrapy_heroku.app.application that launches the scrapyd web service on the port given by the PORT environment variable. The same configuration uses a multi-process work queue backed by the Postgres database that the DATABASE_URL environment variable points to.
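
As a rough illustration of the two environment variables described above, the sketch below reads PORT and DATABASE_URL and splits the database URL into its parts. It is not the package's own code: the helper name read_heroku_settings and the example values are assumptions made for this illustration, and only the standard library is used.

```python
import os
from urllib.parse import urlparse


def read_heroku_settings(env=os.environ):
    """Collect the two settings the description mentions (hypothetical helper)."""
    # Heroku tells the web process which port to bind to through PORT.
    port = int(env["PORT"])
    # DATABASE_URL is assumed to look like postgres://user:pass@host:5432/name
    db = urlparse(env["DATABASE_URL"])
    return {
        "port": port,
        "db_host": db.hostname,
        "db_port": db.port,
        "db_user": db.username,
        "db_password": db.password,
        "db_name": db.path.lstrip("/"),
    }


if __name__ == "__main__":
    # Made-up values, used only to show the shape of the result.
    example_env = {
        "PORT": "5000",
        "DATABASE_URL": "postgres://user:secret@db.example.com:5432/queue",
    }
    print(read_heroku_settings(example_env))
```

Running a check like this inside the dyno before starting the service can make a missing PORT or DATABASE_URL show up as an immediate KeyError rather than a later connection failure.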

Configuration

Create a git repo that has a scrapy project at the root (scrapy.cfg should be at the top level). Edit your scrapy.cfg to include the following:

[scrapyd]
application = scrapy_heroku.app.application

[deploy]
url = http://<YOUR_HEROKU_APP_NAME>.herokuapp.com:80/
project = <YOUR_PROJECT_NAME>
username = <A_USER_NAME>
password = <A_PASSWORD>
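
The settings above can be sanity-checked before deploying. The following is a minimal sketch that assumes scrapy.cfg parses as a standard INI file; the helper name check_scrapy_cfg is hypothetical and not part of scrapy-heroku or Scrapy.

```python
from configparser import ConfigParser

REQUIRED_APPLICATION = "scrapy_heroku.app.application"


def check_scrapy_cfg(text):
    """Return a list of problems found in scrapy.cfg contents (hypothetical helper)."""
    # Interpolation is disabled so a password containing '%' does not raise.
    parser = ConfigParser(interpolation=None)
    parser.read_string(text)
    problems = []
    # The [scrapyd] section must point at the scrapy_heroku application.
    if parser.get("scrapyd", "application", fallback=None) != REQUIRED_APPLICATION:
        problems.append("[scrapyd] application is not " + REQUIRED_APPLICATION)
    # The [deploy] section needs all four keys shown in the configuration above.
    for key in ("url", "project", "username", "password"):
        if not parser.get("deploy", key, fallback=""):
            problems.append("[deploy] is missing " + key)
    return problems


if __name__ == "__main__":
    with open("scrapy.cfg") as handle:
        for problem in check_scrapy_cfg(handle.read()):
            print(problem)
```

An empty result only means the expected keys are present; it does not verify that the URL or credentials are valid.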

Add a requirements.txt file that lists scrapy-heroku. It is strongly recommended that you pin the version of scrapy-heroku as well as the version of Scrapy that your project is developed against (pip freeze > requirements.txt). Finally, create a Procfile that consists of:

web: scrapy server
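
The pinning recommendation for requirements.txt can be checked mechanically. The sketch below is a simplified illustration, not a full requirements parser: the helper name unpinned_requirements is hypothetical, and any line without an exact == pin (including editable or URL requirements) is reported.

```python
def unpinned_requirements(text):
    """Return requirement lines that lack an exact '==' pin (hypothetical helper)."""
    unpinned = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if line and "==" not in line:
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    with open("requirements.txt") as handle:
        for name in unpinned_requirements(handle.read()):
            print("not pinned:", name)
```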

Make sure you have a Postgres database that has been promoted to DATABASE_URL.

Release history

0.7.1 (this version)
0.7
0.6
0.5
0.4
0.3
0.2
0.1

Download files

scrapy-heroku-0.7.1.tar.gz (5.2 kB), source distribution, uploaded Nov 20, 2012
