Scrapy decorator for inline requests

Project description

This module provides a decorator that allows you to write coroutine-like spider callbacks.

The code is experimental: it may not work in all cases and can be hard to debug.

Example:

from inline_requests import inline_requests
from scrapy import Request
from scrapy.spiders import CrawlSpider


class MySpider(CrawlSpider):

    ...

    @inline_requests
    def parse_item(self, response):
        item = self.build_item(response)

        # scrape more information
        response = yield Request(response.url + '?info')
        item['info'] = self.extract_info(response)

        # scrape pictures
        response = yield Request(response.url + '?pictures')
        item['pictures'] = self.extract_pictures(response)

        # a request that might fail (DNS error, network timeout, HTTP 404/500, etc.)
        try:
            response = yield Request(response.url + '?protected')
        except Exception as e:
            self.logger.error(e)
        else:
            item['protected'] = self.extract_protected_info(response)

        # finally yield the item
        yield item
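The core idea can be illustrated without Scrapy at all: the decorator wraps a generator callback and "trampolines" it, sending each fetched response back into the generator at the point where it yielded a request. The sketch below is a simplified, framework-free illustration of that pattern; the names `fetch` and `drive` are illustrative, not part of the library's API, and the real decorator schedules actual Scrapy requests instead of calling a fetch function directly.

```python
def fetch(url):
    # Stand-in for Scrapy's downloader; returns a fake response dict.
    return {"url": url, "body": "content of " + url}

def drive(generator_callback, first_response):
    # Trampoline: resume the generator with each "response",
    # collecting anything that is not a request as a scraped item.
    gen = generator_callback(first_response)
    results = []
    try:
        yielded = next(gen)
        while True:
            if isinstance(yielded, str):            # treat strings as "requests"
                yielded = gen.send(fetch(yielded))  # resume with the response
            else:                                   # anything else is an item
                results.append(yielded)
                yielded = next(gen)
    except StopIteration:
        pass
    return results

def parse_item(response):
    item = {"url": response["url"]}
    info = yield response["url"] + "?info"  # suspend until the "response" arrives
    item["info"] = info["body"]
    yield item

items = drive(parse_item, fetch("http://example.com"))
```

`parse_item` reads top-to-bottom like synchronous code, yet each `yield` hands control back to the driver until the follow-up response is available, which is exactly the ergonomics the decorator provides for spider callbacks.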

Example Project

The example directory includes an example spider for StackOverflow.com:

cd example
scrapy crawl stackoverflow

Requirements

  • Python 2.7+, 3.4+

  • Scrapy 1.0+

Known Issues

  • Middlewares can drop or ignore non-200 responses, which prevents the callback from resuming execution. This can be overcome by setting the handle_httpstatus_all flag in the request meta. See the httperror middleware documentation.

  • High concurrency and large responses can cause higher memory usage.
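To see why the first issue matters, the filtering behavior of Scrapy's httperror middleware can be mimicked in a few lines. The function below is only a simplified stand-in for the real middleware, but the meta keys it checks (`handle_httpstatus_all` and `handle_httpstatus_list`) are the actual per-request opt-outs.

```python
def reaches_callback(response_status, request_meta):
    """Simplified model of httperror filtering: return True if a response
    with this status would be passed through to the spider callback."""
    if 200 <= response_status < 300:
        return True
    # per-request opt-outs honored by the real middleware
    if request_meta.get("handle_httpstatus_all"):
        return True
    if response_status in request_meta.get("handle_httpstatus_list", []):
        return True
    return False

# Without the flag, a 404 is filtered out and the inline callback stalls
# at its yield; with the flag, the try/except in the example can handle it.
assert not reaches_callback(404, {})
assert reaches_callback(404, {"handle_httpstatus_all": True})
```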

