Scrapy Item Record Extension
Project description
os-scrapy-record
This project provides extensions that process a Response or Failure and generate a standard Item.
Install
pip install os-scrapy-record
You can run the example spider directly from the project root path:
scrapy crawl example
APIs
- os_scrapy_record.ResponseCallback
  - The callback method of this extension replaces the default Request.callback, processes the Response, and generates a FetchRecord.
  - The callback method is not used when the request already sets its own callback function.
  - The callback method overrides the parse method of the spider.
  - Enable the extension in the project settings.py file:
    EXTENSIONS = { "os_scrapy_record.ResponseCallback": 1, }
- os_scrapy_record.ResponseErrback
  - The errback method of this extension replaces the default Request.errback, processes the Failure, and generates a FetchRecord.
  - The errback method is not used when the request already sets its own errback function.
  - Enable the extension in the project settings.py file:
    EXTENSIONS = { "os_scrapy_record.ResponseErrback": 1, }
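The precedence rule described for both extensions (a handler set on the request always wins, and the extension only fills in what is missing) can be sketched as a small standalone model. This is not the library's implementation: FakeRequest, extension_callback, extension_errback, and apply_defaults are hypothetical names used only for illustration, and the sketch deliberately avoids importing Scrapy.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FakeRequest:
    """Hypothetical stand-in for scrapy.Request; not part of os-scrapy-record."""
    url: str
    callback: Optional[Callable] = None
    errback: Optional[Callable] = None


def extension_callback(response):
    # Placeholder for the extension's callback, which would build a FetchRecord.
    return "record from response"


def extension_errback(failure):
    # Placeholder for the extension's errback, which would build a FetchRecord.
    return "record from failure"


def apply_defaults(request: FakeRequest) -> FakeRequest:
    """Fill in only the handlers the request did not set itself."""
    if request.callback is None:
        request.callback = extension_callback
    if request.errback is None:
        request.errback = extension_errback
    return request


def my_parse(response):
    # A user-defined callback that should be left untouched.
    return "custom result"


plain = apply_defaults(FakeRequest("http://example.com/"))
custom = apply_defaults(FakeRequest("http://example.com/", callback=my_parse))
```

In this model, plain ends up with both extension handlers, while custom keeps my_parse as its callback and only receives the extension errback.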
- os_scrapy_record.FetchRecord
  This class is a subclass of Item. Its members are:
  - request: os_scrapy_record.items.RequestItem, with members url, method, headers, body
  - meta: dict, taken from request.meta; keys are best written in lower case with '_' as the separator
  - response: os_scrapy_record.items.ResponseItem, with members headers, body, status, ip_address (Scrapy 2.1.0+), failure
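As a rough illustration of how the members listed above might be read downstream, here is a minimal item pipeline sketch. It assumes the record and its nested items support key access the way Scrapy Items do. StatusLoggingPipeline is a hypothetical name, plain dicts stand in for FetchRecord so the sketch runs without Scrapy, and the shape of the status value is an assumption.

```python
class StatusLoggingPipeline:
    """Hypothetical item pipeline; the class name and the seen list are illustrative."""

    def __init__(self):
        self.seen = []

    def process_item(self, item, spider):
        # Read the documented members by key; Scrapy Items and dicts both allow this.
        request = item["request"]
        response = item["response"]
        self.seen.append((request["url"], request["method"], response["status"]))
        return item


# Plain dicts stand in for a FetchRecord and its nested items.
record = {
    "request": {"url": "http://example.com/", "method": "GET", "headers": {}, "body": b""},
    "meta": {"crawl_depth": 0},  # lower case with '_' as separator, as recommended
    "response": {
        "headers": {},
        "body": b"<html></html>",
        "status": ("HTTP", 200),  # assumed shape: a (group, code) pair
    },
}

pipeline = StatusLoggingPipeline()
result = pipeline.process_item(record, spider=None)
```

In a real project such a pipeline would be registered through the ITEM_PIPELINES setting, and the fields actually populated on a given record should be checked against the package source.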
- os_scrapy_record.fetch_status.FetchStatus
  A member of ResponseItem that covers HTTP, DNS, network, and user-defined statuses. It is a two-tuple of group and code, e.g. HTTP:200, DNS:-2, SERVER:111, RULE:16.
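The group and code pairing can be modelled with a small named tuple. This is a sketch of the idea only: Status and parse_status are hypothetical names, not the library's FetchStatus API, and the "GROUP:code" text form is taken from the examples above.

```python
from typing import NamedTuple


class Status(NamedTuple):
    """Hypothetical model of a group/code pair; not the library's FetchStatus class."""
    group: str
    code: int

    def __str__(self) -> str:
        # Render in the "GROUP:code" form used in the examples above.
        return f"{self.group}:{self.code}"


def parse_status(text: str) -> Status:
    # Split on the first colon only, so negative codes such as "DNS:-2" parse cleanly.
    group, code = text.split(":", 1)
    return Status(group, int(code))


examples = [parse_status(s) for s in ("HTTP:200", "DNS:-2", "SERVER:111", "RULE:16")]
```

Keeping the group separate from the code lets a consumer tell an HTTP status apart from a DNS or rule-level status that happens to share the same number.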
Unit Tests
sh scripts/test.sh
License
MIT licensed.