Skip to main content
Join the official 2019 Python Developers SurveyStart the survey!

Utilities for running scrapy on heroku

Project description

A package to assist with running scrapy on heroku. This is accomplished by providing a custom application configuration at scrapy_heroku.app.application that launches the scrapyd web service using the PORT environment variable and a multi-process work queue implemented on a Postgres database specified by the DATABASE_URL environment variable.

Configuration

Create a git repo that has a scrapy project at the root (scrapy.cfg should be at the top level). Edit your scrapy.cfg to include the following:

[scrapyd]
application = scrapy_heroku.app.application

[deploy]
url = http://<YOUR_HEROKU_APP_NAME>.herokuapp.com:80/
project = <YOUR_PROJECT_NAME>
username = <A_USER_NAME>
password = <A_PASSWORD>

Add a requirements.txt file that includes scrapy-heroku in it. It is strongly recommended that you version pin scrapy-heroku as well as the version of scrapy that your project is developed against (pip freeze > requirements.txt). Finally create a Procfile that consists of:

web: scrapy server

Make sure you have a postgres database that has been promoted to DATABASE_URL

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for scrapy-heroku, version 0.7.1
Filename, size File type Python version Upload date Hashes
Filename, size scrapy-heroku-0.7.1.tar.gz (5.2 kB) File type Source Python version None Upload date Hashes View hashes

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN SignalFx SignalFx Supporter DigiCert DigiCert EV certificate StatusPage StatusPage Status page