API for Scrapy spiders
=============
ScrappyServer
=============
.. image:: https://travis-ci.org/kirankoduru/arachne.svg
    :target: https://travis-ci.org/kirankoduru/arachne

.. image:: https://coveralls.io/repos/kirankoduru/arachne/badge.svg?branch=master&service=github
    :target: https://coveralls.io/github/kirankoduru/arachne?branch=master
ScrappyServer provides a wrapper around your Scrapy ``Spider`` objects so you can run them through a Flask app. All you have to do is customize ``SPIDER_SETTINGS`` in the settings file.
Installation
============
You can install **Arachne** from pip::

    pip install Arachne
Sample settings
===============
This is a sample settings file for the spiders in your project. The settings file should be named **settings.py** for **Arachne** to find it, and it looks like this::
    # settings.py file
    SPIDER_SETTINGS = [
        {
            'endpoint': 'dmoz',
            'location': 'spiders.DmozSpider',
            'spider': 'DmozSpider'
        }
    ]
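Each ``SPIDER_SETTINGS`` entry ties a URL endpoint to a spider class named by a dotted path. As a rough illustration of that mapping (this is not Arachne's actual resolution code; the ``resolve`` helper below is hypothetical), the ``location`` string can be split into a module path and a class name:

```python
# Hypothetical sketch: split a SPIDER_SETTINGS entry's dotted
# 'location' into (module path, class name). Arachne's real
# lookup logic may differ; this only illustrates the mapping.
SPIDER_SETTINGS = [
    {
        'endpoint': 'dmoz',
        'location': 'spiders.DmozSpider',
        'spider': 'DmozSpider',
    }
]

def resolve(entry):
    # rpartition splits on the last dot, leaving the module path
    # on the left and the class name on the right.
    module_path, _, class_name = entry['location'].rpartition('.')
    return module_path, class_name

for entry in SPIDER_SETTINGS:
    module_path, class_name = resolve(entry)
    print(entry['endpoint'], module_path, class_name)
    # dmoz spiders DmozSpider
```

In a real project the module path would then be passed to ``importlib.import_module`` and the class fetched with ``getattr``.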
Usage
=====
It looks very similar to a Flask app, but since **Scrapy** depends on the Python **Twisted** package, we need to run our Flask app with **Twisted**::
    from twisted.web.wsgi import WSGIResource
    from twisted.web.server import Site
    from twisted.internet import reactor

    from arachne import ScrappyServer

    app = ScrappyServer(__name__)

    # Wrap the WSGI app so Twisted's reactor can serve it.
    resource = WSGIResource(reactor, reactor.getThreadPool(), app)
    site = Site(resource)
    reactor.listenTCP(8080, site)

    if __name__ == '__main__':
        reactor.run()
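Twisted is only one way to host the WSGI app. For a quick local sanity check of the same wiring, Python's standard-library ``wsgiref`` can serve any WSGI callable; here a trivial stand-in callable takes the place of the ScrappyServer app (an assumption for illustration, so the sketch runs without Arachne installed):

```python
import threading
import urllib.request
from wsgiref.simple_server import make_server

def app(environ, start_response):
    # Stand-in WSGI callable; in a real project this would be the
    # ScrappyServer app object created from your settings.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'spider endpoint placeholder']

# Bind to an ephemeral port (0) so the example never collides
# with the port 8080 used above.
server = make_server('127.0.0.1', 0, app)
port = server.server_port

# Serve exactly one request in a background thread, then fetch it.
thread = threading.Thread(target=server.handle_request)
thread.start()
with urllib.request.urlopen(f'http://127.0.0.1:{port}/') as resp:
    body = resp.read()
thread.join()
server.server_close()
print(body.decode())  # spider endpoint placeholder
```

Because WSGI is a shared contract, swapping ``make_server`` for the Twisted ``WSGIResource`` setup above changes the host, not the app.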