
API for Scrapy spiders

Project description


Arachne wraps your Scrapy spiders so that they can be run through a Flask app. All you have to do is customize SPIDER_SETTINGS in the settings file.

Installation

You can install ArachneScrapy with pip:

pip install ArachneScrapy

Sample settings

This is a sample settings file for the spiders in your project. The file must be named settings.py for Arachne to find it, and looks like this:

# settings.py file
SPIDER_SETTINGS = [
    {
        'endpoint': 'dmoz',
        'location': 'spiders.DmozSpider',
        'spider': 'DmozSpider'
    }
]
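
Before starting the app, it can help to sanity-check the settings list. The following is a minimal sketch, not part of Arachne: validate_spider_settings and resolve_spider are hypothetical helpers. It assumes that the three keys shown above are all required, that each endpoint should be unique, and that 'location' is an importable module path containing the class named by 'spider'. Arachne itself may resolve these values differently.

```python
# Hypothetical helpers (not part of Arachne) for sanity-checking SPIDER_SETTINGS.
from importlib import import_module

# Assumption: the three keys from the sample settings are all required.
REQUIRED_KEYS = ('endpoint', 'location', 'spider')


def validate_spider_settings(settings):
    """Return a list of error strings; an empty list means no problems were found."""
    errors = []
    seen_endpoints = set()
    for index, entry in enumerate(settings):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            errors.append('entry %d is missing keys: %s' % (index, ', '.join(missing)))
            continue
        # Assumption: two spiders should not share the same endpoint name.
        if entry['endpoint'] in seen_endpoints:
            errors.append('entry %d reuses endpoint %r' % (index, entry['endpoint']))
        seen_endpoints.add(entry['endpoint'])
    return errors


def resolve_spider(entry):
    """Import entry['location'] as a module and return the attribute named entry['spider']."""
    module = import_module(entry['location'])
    return getattr(module, entry['spider'])
```

With the sample settings above, validate_spider_settings(SPIDER_SETTINGS) would return an empty list, and resolve_spider(SPIDER_SETTINGS[0]) would try to import spiders.DmozSpider and fetch DmozSpider from it, raising ImportError or AttributeError if the path is wrong.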

Usage

The setup looks very similar to a regular Flask app, but because Scrapy depends on the Python Twisted package, the Flask app must be served by Twisted:

from twisted.web.wsgi import WSGIResource
from twisted.web.server import Site
from twisted.internet import reactor
from arachne import Arachne

# Create the app the same way you would create a Flask app
app = Arachne(__name__)

# Wrap the WSGI app so Twisted can serve it from the reactor's thread pool
resource = WSGIResource(reactor, reactor.getThreadPool(), app)
site = Site(resource)
reactor.listenTCP(8080, site)  # listen on port 8080

if __name__ == '__main__':
    reactor.run()


Built Distribution

ArachneScrapy-0.6.3-py2-none-any.whl (14.0 kB, Python 2)
