scrapy-multifeedexporter
========================
This `Scrapy <http://scrapy.org/>`__ extension exports scraped items of
different types to multiple feeds. By default, each item type gets its
own feed.

Installation
------------
.. code-block:: bash

    $ pip install scrapy-multifeedexporter

Configuration
-------------
You'll have to replace the default ``FeedExporter`` with
``MultiFeedExporter`` by adding the following lines to your project's
``settings.py`` file:
.. code:: python

    from multifeedexporter import MultiFeedExporter

    EXTENSIONS = {
        'scrapy.contrib.feedexport.FeedExporter': None,
        'multifeedexporter.MultiFeedExporter': 500,
    }

    # Automatically configure available item names from your module
    MULTIFEEDEXPORTER_ITEMS = MultiFeedExporter.get_bot_items(BOT_NAME)
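``get_bot_items(BOT_NAME)`` collects the item names declared in your
project, so your item types need to be defined as ordinary Scrapy items.
A minimal sketch of such definitions (the class and field names here are
illustrative, not part of the extension):

.. code:: python

    # items.py -- illustrative item types; the names are placeholders
    import scrapy

    class BookItem(scrapy.Item):
        title = scrapy.Field()
        price = scrapy.Field()

    class AuthorItem(scrapy.Item):
        name = scrapy.Field()

With two item types declared like this, the exporter produces two feeds
instead of one.
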
Usage
-----
When calling ``scrapy crawl``, include the ``%(item_name)s`` placeholder
in the output file name or URI. The following invocations demonstrate
the placeholder:
.. code:: bash

    $ scrapy crawl -o "spider_name_%(item_name)s.csv" -t csv spider_name
    $ scrapy crawl -o "ftp://foo:bar@example.com/spider_name_%(item_name)s.csv" -t csv spider_name

If you omit the placeholder, all items will be placed in one file.
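To see how items are routed, here is a sketch of a spider that yields
two item types (the spider name, URLs, and selectors are assumptions,
and ``BookItem``/``AuthorItem`` are the illustrative classes from
above); each type ends up in its own feed:

.. code:: python

    import scrapy

    # Hypothetical project module holding the item classes sketched above
    from myproject.items import AuthorItem, BookItem

    class BooksSpider(scrapy.Spider):
        name = "spider_name"
        start_urls = ["http://example.com/books"]

        def parse(self, response):
            for book in response.css("li.book"):
                # Each yielded type is written to its own feed via the
                # %(item_name)s placeholder in the -o option
                yield BookItem(
                    title=book.css("h2::text").extract_first(),
                    price=book.css(".price::text").extract_first(),
                )
                yield AuthorItem(name=book.css(".author::text").extract_first())
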
License
-------
scrapy-multifeedexporter is published under the MIT license.