A collection of helpers for running Scrapy in ScraperWiki

A collection of helpers for running scrapers built with Scrapy in ScraperWiki

Launch scraper without scrapy CLI


from scrapy.conf import settings
from scrapyrwiki import run_spider

# MySpider is your own spider class; import it from your project.

def main():
    run_spider(MySpider(), settings)

if __name__ == '__main__':
    main()

Save produced data to ScraperWiki

Add “scrapyrwiki.pipelines.ScraperWikiPipeline” to the ITEM_PIPELINES setting.


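from scrapy.conf import settings
from scrapyrwiki import run_spider

# MySpider is your own spider class; import it from your project.

def scraperwiki():
    options = {
        'SW_SAVE_BUFFER': 5,
        'SW_UNIQUE_KEYS': {"MyItem": ['url']},
        'ITEM_PIPELINES': ['scrapyrwiki.pipelines.ScraperWikiPipeline'],
    }
    # Assumption: the original snippet does not show how the options reach
    # Scrapy; settings.overrides is the mechanism older Scrapy releases offered.
    settings.overrides.update(options)
    run_spider(MySpider(), settings)

# The original checks for 'scraper' rather than '__main__'.
if __name__ == 'scraper':
    scraperwiki()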

Check spider contracts in CI

Launch the spider with run_tests, passing the path of the report file to write.


from scrapyrwiki import run_tests
from scrapy.conf import settings

run_tests(MySpider(), "output.xml", settings)

Note: Testing uses the HTTP cache. The directory from which the script is launched must contain a scrapy.cfg file (Scrapy needs it to recognize the directory as a scraper project) and a .scrapy directory holding the HTTP cache database.
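
A CI job can check these two prerequisites before calling run_tests. The following is a minimal sketch using only the standard library; the function name missing_test_prerequisites is hypothetical and not part of scrapyrwiki.

```python
from pathlib import Path


def missing_test_prerequisites(directory="."):
    """Return the names of required entries that are absent from `directory`."""
    base = Path(directory)
    missing = []
    # scrapy.cfg marks the directory as a Scrapy project.
    if not (base / "scrapy.cfg").is_file():
        missing.append("scrapy.cfg")
    # .scrapy is expected to hold the HTTP cache used during testing.
    if not (base / ".scrapy").is_dir():
        missing.append(".scrapy")
    return missing
```

Calling missing_test_prerequisites() with no argument inspects the current working directory; an empty list means both entries are present. The check only looks at names, it does not verify that the cache actually contains responses.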

The output is in XUnit format and has been tested with Jenkins.
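
Outside Jenkins, the report can be summarized with the standard library. This sketch assumes the conventional XUnit layout (testcase elements with optional failure or error children); the exact elements scrapyrwiki writes are not documented here, and the sample report below is made up for illustration.

```python
import xml.etree.ElementTree as ET


def summarize_xunit(xml_text):
    """Count test cases, failures and errors in an XUnit-style report."""
    root = ET.fromstring(xml_text)
    # iter() walks all descendants, so a <testsuite> root and a
    # <testsuites> wrapper are both handled.
    cases = list(root.iter("testcase"))
    failures = sum(1 for case in cases if case.find("failure") is not None)
    errors = sum(1 for case in cases if case.find("error") is not None)
    return {"tests": len(cases), "failures": failures, "errors": errors}


# Hypothetical report used only to exercise the function.
SAMPLE_REPORT = """<testsuite name="MySpider">
  <testcase name="parse_returns_items"/>
  <testcase name="parse_scrapes_url"><failure message="missing field"/></testcase>
</testsuite>"""

print(summarize_xunit(SAMPLE_REPORT))
# prints {'tests': 2, 'failures': 1, 'errors': 0}
```

To summarize a real run, read the file passed to run_tests (output.xml in the example above) and hand its contents to summarize_xunit.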

Log scraper errors to Sentry

Install scrapy-sentry and set the SENTRY_DSN environment variable to your Sentry DSN. scrapyrwiki handles the rest.
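
Because the integration depends on an environment variable, a launcher script may want to confirm the variable is set before starting the spider. A small sketch, assuming nothing about scrapyrwiki beyond the variable name; the helper sentry_dsn is hypothetical.

```python
import os


def sentry_dsn(environ=None):
    """Return the configured Sentry DSN, or None when it is unset or blank."""
    env = os.environ if environ is None else environ
    value = env.get("SENTRY_DSN", "").strip()
    return value or None
```

A launcher could log a warning when sentry_dsn() returns None, so that a missing key is noticed before errors go unreported.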

Files for scrapyrwiki, version 0.2

Filename: scrapyrwiki-0.2.tar.gz (3.5 kB)
File type: Source