Scrapy extension for outputting scraped items to an Amazon SQS instance
scrapy-sqs-exporter
This is an extension to Scrapy to allow exporting of scraped items to an Amazon SQS instance.
Setup
After installing the package, the two classes defined in the library need to be added to the relevant sections of the settings file:
```python
FEED_EXPORTERS = {
    'sqs': 'sqsfeedexport.SQSExporter'
}

FEED_STORAGES = {
    'sqs': 'sqsfeedexport.SQSFeedStorage'
}
```
The FEED_STORAGES entry registers the sqs:// URI scheme, which differentiates this backend from Scrapy's other URI-based storage options.
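For illustration, a sqs:// feed URI can be reduced to a bare queue name with a short helper. This is a hypothetical sketch of the parsing step, not the library's actual code:

```python
from urllib.parse import urlparse

def queue_name_from_feed_uri(feed_uri: str) -> str:
    """Extract the SQS queue name from a FEED_URI such as 'sqs://foo'."""
    parsed = urlparse(feed_uri)
    if parsed.scheme != "sqs":
        raise ValueError(f"expected an sqs:// URI, got {feed_uri!r}")
    # For 'sqs://foo', urlparse places 'foo' in the netloc component.
    return parsed.netloc

print(queue_name_from_feed_uri("sqs://foo"))  # foo
```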
We also need to define five keys in the environment:

```
AWS_DEFAULT_REGION=eu-central-1
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
FEED_URI=sqs://foo
FEED_FORMAT=sqs
```
AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the AWS credentials to use, and AWS_DEFAULT_REGION is the region in which the SQS queue lives. FEED_URI names the SQS queue in that region. For example:
```
AWS_DEFAULT_REGION=us-east-1
FEED_URI=sqs://bar
FEED_FORMAT=sqs
```
would refer to a queue named bar in the us-east-1 region.
Finally, setting FEED_FORMAT to sqs makes Scrapy use the SQSExporter class when exporting scraped items.
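The library's message format isn't documented here, but conceptually the exporter serializes each scraped item and sends it as an SQS message body. A minimal sketch, assuming JSON serialization; the helper name and the wire format are illustrative assumptions, not the library's actual API:

```python
import json

def item_to_sqs_message(item: dict) -> str:
    # Serialize a scraped item to a plain-text string suitable for an
    # SQS message body (SQS bodies are text, limited to 256 KB).
    return json.dumps(item, sort_keys=True)

# A real exporter would then send the body with boto3, e.g.:
#   sqs.send_message(QueueUrl=queue_url, MessageBody=body)
body = item_to_sqs_message({"title": "Example", "price": "9.99"})
print(body)  # {"price": "9.99", "title": "Example"}
```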
Hashes for scrapy-sqs-exporter-1.0.3.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 1c35863c9c70a48e130dd8d049f21689a2085b6f7fda372bfc823156c2357364
MD5 | 73f124b84bd975819d261766e8ace751
BLAKE2b-256 | 6ea249f90bc269066c0cc1844099178cf6939b50b0b996484bd03683a3790b84
Hashes for scrapy_sqs_exporter-1.0.3-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 4916db88ec28c7e6fcd3a14392bb99ee86e8ea64f2a20798980263df41f14dff
MD5 | dbfac71d641cdbf3192da783ece4ac80
BLAKE2b-256 | fa7ad2cb6e53f885a35daf4f81edf31a88dae3f5dd0d7f785602f6d8878579b9