API for Scrapy spiders

Project description

Arachne wraps your Scrapy spiders so that you can run them through a Flask app. All you have to do is customize SPIDER_SETTINGS in the settings file.

Installation

You can install ArachneScrapy with pip:

pip install ArachneScrapy

Sample settings

This is a sample settings file for the spiders in your project. The file must be called settings.py for Arachne to find it, and it looks like this:

# settings.py file
SPIDER_SETTINGS = [
    {
        'endpoint': 'dmoz',
        'location': 'spiders.DmozSpider',
        'spider': 'DmozSpider'
    }
]
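
Since every entry in the sample carries the same three keys, a quick sanity check before starting the app can catch typos early. The sketch below is not part of Arachne: check_spider_settings is a hypothetical helper, and the rule that endpoints should be unique is an assumption made here for illustration.

```python
# Hypothetical helper, not part of Arachne: checks the shape of SPIDER_SETTINGS.
REQUIRED_KEYS = ("endpoint", "location", "spider")


def check_spider_settings(spider_settings):
    """Return a list of problems found; an empty list means the entries look well-formed."""
    problems = []
    seen_endpoints = set()
    for index, entry in enumerate(spider_settings):
        # A key counts as missing if it is absent or set to an empty value.
        missing = [key for key in REQUIRED_KEYS if not entry.get(key)]
        if missing:
            problems.append("entry %d: missing or empty %s" % (index, ", ".join(missing)))
            continue
        # Assumption: two spiders should not share one endpoint name.
        if entry["endpoint"] in seen_endpoints:
            problems.append("entry %d: duplicate endpoint %r" % (index, entry["endpoint"]))
        seen_endpoints.add(entry["endpoint"])
    return problems


if __name__ == "__main__":
    sample = [
        {
            "endpoint": "dmoz",
            "location": "spiders.DmozSpider",
            "spider": "DmozSpider",
        }
    ]
    for problem in check_spider_settings(sample):
        print(problem)
```

The helper only inspects the dictionaries themselves; it does not try to import the spider, because how Arachne resolves 'location' and 'spider' is not described here.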

Usage

The application looks very similar to a Flask app, but because Scrapy depends on the Python Twisted package, we need to run it with Twisted:

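from twisted.web.wsgi import WSGIResource
from twisted.web.server import Site
from twisted.internet import reactor
from arachne import Arachne

# Create the application object, as you would with Flask
app = Arachne(__name__)

# Serve the WSGI app from Twisted's reactor, using its thread pool
resource = WSGIResource(reactor, reactor.getThreadPool(), app)
site = Site(resource)
reactor.listenTCP(8080, site)

if __name__ == '__main__':
    # Start the Twisted event loop; this call blocks until the reactor stops
    reactor.run()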
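
Once the reactor is running, one simple way to confirm that something is listening on port 8080 is to attempt a TCP connection from a separate process. The sketch below uses only the standard library and assumes Python 3; is_listening is a hypothetical helper, and the host 127.0.0.1 is an assumption about where the app is running.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout seconds."""
    try:
        # create_connection raises OSError on refusal, timeout, or unreachable host.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8080 is the port passed to reactor.listenTCP in the usage example.
    print(is_listening("127.0.0.1", 8080))
```

This only shows that the port accepts connections; it says nothing about whether the spiders themselves are configured correctly.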

Download files


Files for ArachneScrapy, version 0.6.3

Filename: ArachneScrapy-0.6.3-py2-none-any.whl (14.0 kB)
File type: Wheel
Python version: py2
