A Python wrapper for working with the Scrapyd API


Allows a Python application to talk to, and therefore control, the Scrapy daemon: Scrapyd.

Install

The easiest way to install is via pip:

pip install python-scrapyd-api

Quick Usage

Please refer to the full documentation for more detailed usage, but to get you started:

>>> from scrapyd_api import ScrapydAPI
>>> scrapyd = ScrapydAPI('http://localhost:6800')
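
If your Scrapyd instance sits behind HTTP basic auth, the constructor also accepts an auth tuple that is passed through to the underlying HTTP client. A hedged sketch; check the documentation for your installed version, as this keyword may not be available in every release:

>>> scrapyd = ScrapydAPI('http://localhost:6800', auth=('username', 'password'))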

Add a project egg as a new version:

>>> egg = open('some_egg.egg', 'rb')
>>> scrapyd.add_version('project_name', 'version_name', egg)
# Returns the number of spiders in the project.
3
>>> egg.close()

Cancel a scheduled job:

>>> scrapyd.cancel('project_name', '14a6599ef67111e38a0e080027880ca6')
# Returns True if the request was met with an OK response.
True
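
Because cancel takes a job id, it composes naturally with list_jobs (shown below) to clear out a project's queue. A minimal sketch:

>>> jobs = scrapyd.list_jobs('project_name')
>>> for job in jobs['pending']:
...     # Cancel each job that has not yet started running.
...     scrapyd.cancel('project_name', job['id'])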

Delete a project and all of its versions:

>>> scrapyd.delete_project('project_name')
# Returns True if the request was met with an OK response.
True

Delete a version of a project:

>>> scrapyd.delete_version('project_name', 'version_name')
# Returns True if the request was met with an OK response.
True

List all jobs registered with a given project:

>>> scrapyd.list_jobs('project_name')
# Returns a dict of running, finished and pending job lists.
{
    'pending': [
        {
            u'id': u'24c35...f12ae',
            u'spider': u'spider_name'
        },
    ],
    'running': [
        {
            u'id': u'14a65...b27ce',
            u'spider': u'spider_name',
            u'start_time': u'2014-06-17 22:45:31.975358'
        },
    ],
    'finished': [
        {
            u'id': u'34c23...b21ba',
            u'spider': u'spider_name',
            u'start_time': u'2014-06-17 22:45:31.975358',
            u'end_time': u'2014-06-23 14:01:18.209680'
        }
    ]
}
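
list_jobs also enables simple polling: schedule a job, then watch for its id to move into the 'finished' list. A minimal sketch, with an arbitrary polling interval:

>>> import time
>>> def wait_until_finished(project, job_id, delay=10):
...     # Poll Scrapyd until the job id appears among finished jobs.
...     while True:
...         finished = scrapyd.list_jobs(project)['finished']
...         if any(job['id'] == job_id for job in finished):
...             return
...         time.sleep(delay)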

List all projects registered:

>>> scrapyd.list_projects()
[u'ecom_project', u'estate_agent_project', u'car_project']

List all spiders available to a given project:

>>> scrapyd.list_spiders('project_name')
[u'raw_spider', u'js_enhanced_spider', u'selenium_spider']
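
Combined with schedule (covered below), this makes it easy to kick off a run of every spider a project exposes; a minimal sketch:

>>> for spider in scrapyd.list_spiders('project_name'):
...     # Each call returns the Scrapyd job id for that spider's run.
...     scrapyd.schedule('project_name', spider)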

List all versions registered to a given project:

>>> scrapyd.list_versions('project_name')
[u'345', u'346', u'347', u'348']
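
list_versions pairs naturally with delete_version, so pruning everything but the newest version is a short loop. A sketch, assuming versions are returned oldest first, as the example above suggests:

>>> versions = scrapyd.list_versions('project_name')
>>> for version in versions[:-1]:  # keep only the most recent version
...     scrapyd.delete_version('project_name', version)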

Schedule a job to run with a specific spider:

>>> scrapyd.schedule('project_name', 'spider_name')
# Returns the Scrapyd job id.
u'14a6599ef67111e38a0e080027880ca6'

Schedule a job to run while passing override settings:

>>> settings = {'DOWNLOAD_DELAY': 2}
>>> scrapyd.schedule('project_name', 'spider_name', settings=settings)
u'25b6588ef67333e38a0e080027880de7'

Schedule a job to run while passing extra attributes to spider initialisation:

>>> scrapyd.schedule('project_name', 'spider_name', extra_attribute='value')
# NB: 'project', 'spider' and 'settings' are reserved kwargs for this
# method and therefore these names should be avoided when trying to pass
# extra attributes to the spider init.
u'25b6588ef67333e38a0e080027880de7'
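
On the Scrapy side, these extra attributes arrive as keyword arguments to the spider's constructor. A hypothetical spider sketch; the class and attribute names are illustrative only, not part of this library:

import scrapy

class SpiderName(scrapy.Spider):
    name = 'spider_name'

    def __init__(self, extra_attribute=None, *args, **kwargs):
        super(SpiderName, self).__init__(*args, **kwargs)
        # The value passed to scrapyd.schedule() lands here.
        self.extra_attribute = extra_attribute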

Contributing code and/or running the tests

Please see DEVELOPMENT.rst or refer to the full documentation.

License

2-clause BSD. See the full LICENSE.

History

0.1.0 (2014-09-16)

  • First release on PyPI.
