
=============
ScrappyServer
=============

.. image:: https://travis-ci.org/kirankoduru/arachne.svg
   :target: https://travis-ci.org/kirankoduru/arachne

.. image:: https://coveralls.io/repos/kirankoduru/arachne/badge.svg?branch=master&service=github
   :target: https://coveralls.io/github/kirankoduru/arachne?branch=master

ScrappyServer provides a wrapper around your Scrapy ``Spider`` objects so you can run them through a Flask app. All you have to do is customize ``SPIDER_SETTINGS`` in the settings file.


Installation
============
You can install **ScrappyServer** from pip::

    pip install ScrappyServer


Sample settings
===============
This is a sample settings file for the spiders in your project. The settings file should be called **settings.py** for **ScrappyServer** to find it, and it looks like this::

    # settings.py file
    SPIDER_SETTINGS = [
        {
            'endpoint': 'dmoz',
            'location': 'spiders.DmozSpider',
            'spider': 'DmozSpider'
        }
    ]
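
Each entry points **ScrappyServer** at one spider class: ``endpoint`` is the URL segment the spider is served under, ``location`` is the importable path to the class, and ``spider`` is the class name. A minimal ``spiders.py`` matching the settings above might look like the following sketch (the start URL and parse logic are illustrative assumptions, not part of this package)::

    # spiders.py -- a minimal sketch of the DmozSpider referenced in
    # SPIDER_SETTINGS; the start URL and parsing below are assumptions.
    from scrapy import Spider


    class DmozSpider(Spider):
        name = 'dmoz'
        start_urls = ['https://dmoz-odp.org/']

        def parse(self, response):
            # Yield a simple item for each page title on the response
            for title in response.css('title::text').getall():
                yield {'title': title}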

Usage
=====
It looks very similar to a Flask app, but since **Scrapy** depends on the Python **Twisted** package, we need to run our Flask app with **Twisted**::

    from twisted.web.wsgi import WSGIResource
    from twisted.web.server import Site
    from twisted.internet import reactor

    from arachne import ScrappyServer

    app = ScrappyServer(__name__)

    # Wrap the Flask-style WSGI app in a Twisted resource and serve it
    resource = WSGIResource(reactor, reactor.getThreadPool(), app)
    site = Site(resource)
    reactor.listenTCP(8080, site)

    if __name__ == '__main__':
        reactor.run()
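
Once the server is running, each configured spider should be reachable over HTTP. Assuming **ScrappyServer** keeps upstream Arachne's ``/run-spider/<endpoint>`` route layout (an assumption; check the package's routes), you could trigger the ``dmoz`` spider like this::

    # trigger_spider.py -- hedged sketch; the /run-spider/<endpoint>
    # route follows upstream Arachne's convention and is an assumption.
    import urllib.request

    # Ask the running server above to start the spider at endpoint 'dmoz'
    with urllib.request.urlopen('http://127.0.0.1:8080/run-spider/dmoz') as resp:
        print(resp.status, resp.read().decode())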
