
=============
ScrappyServer
=============
.. image:: https://travis-ci.org/kirankoduru/arachne.svg
    :target: https://travis-ci.org/kirankoduru/arachne

.. image:: https://coveralls.io/repos/kirankoduru/arachne/badge.svg?branch=master&service=github
    :target: https://coveralls.io/github/kirankoduru/arachne?branch=master

ScrappyServer provides a wrapper around your Scrapy ``Spider`` objects so you can run them through a Flask app. All you have to do is customize ``SPIDER_SETTINGS`` in the settings file.


Installation
============
You can install **ScrappyServer** with pip::

    pip install ScrappyServer


Sample settings
===============
This is a sample settings file for the spiders in your project. The settings file must be named **settings.py** for **ScrappyServer** to find it, and it looks like this::

    # settings.py file
    SPIDER_SETTINGS = [
        {
            'endpoint': 'dmoz',
            'location': 'spiders.DmozSpider',
            'spider': 'DmozSpider'
        }
    ]
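Every entry in ``SPIDER_SETTINGS`` needs all three keys (``endpoint``, ``location``, ``spider``) before the server can route a request to a spider. As a minimal sketch of that shape, here is a hypothetical validation helper (it is an illustration, not part of the package):

```python
# Hypothetical helper: checks that each SPIDER_SETTINGS entry carries
# the three keys the server expects. Not part of ScrappyServer itself.
REQUIRED_KEYS = {'endpoint', 'location', 'spider'}

def validate_spider_settings(spider_settings):
    """Return the entries unchanged, raising if any entry misses a key."""
    for entry in spider_settings:
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            raise ValueError('spider entry %r is missing keys: %s'
                             % (entry, ', '.join(sorted(missing))))
    return spider_settings

SPIDER_SETTINGS = [
    {
        'endpoint': 'dmoz',
        'location': 'spiders.DmozSpider',
        'spider': 'DmozSpider'
    }
]

validate_spider_settings(SPIDER_SETTINGS)  # passes; a bad entry would raise
```

An entry without, say, ``'spider'`` would raise a ``ValueError`` naming the missing key, which is usually easier to debug than a failed spider lookup at request time.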

Usage
=====
It looks very similar to a Flask app, but since **Scrapy** depends on the Python **Twisted** package, we need to run the Flask app under **Twisted**::

    from twisted.web.wsgi import WSGIResource
    from twisted.web.server import Site
    from twisted.internet import reactor

    from arachne import ScrappyServer

    app = ScrappyServer(__name__)

    # Wrap the WSGI app so Twisted can serve it on port 8080
    resource = WSGIResource(reactor, reactor.getThreadPool(), app)
    site = Site(resource)
    reactor.listenTCP(8080, site)

    if __name__ == '__main__':
        reactor.run()
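``WSGIResource`` can wrap any WSGI callable, which is all the Flask app is underneath. As a minimal sketch of that contract (standard library only, independent of Twisted and Flask), a WSGI app is just a function taking ``environ`` and ``start_response``:

```python
# A minimal WSGI callable: the same interface WSGIResource wraps
# when it serves the Flask app above.
def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello from WSGI']

# Invoke it directly with a bare-bones environ, as any WSGI server would.
captured = {}

def start_response(status, headers):
    captured['status'] = status
    captured['headers'] = headers

body = b''.join(app({'REQUEST_METHOD': 'GET', 'PATH_INFO': '/'}, start_response))
print(captured['status'], body)  # 200 OK b'hello from WSGI'
```

Twisted's ``WSGIResource`` runs this callable on the reactor's thread pool, so the synchronous Flask code never blocks the reactor loop itself.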



