Export scraped items of different types to multiple feeds.

scrapy-multifeedexporter
========================

This `Scrapy <http://scrapy.org/>`__ extension exports scraped items of
different types to multiple feeds. By default, each item type gets its
own feed.

Installation
------------

.. code-block:: bash

    $ pip install scrapy-multifeedexporter

Configuration
-------------

Replace the default ``FeedExporter`` with ``MultiFeedExporter`` by
adding the following lines to the ``settings.py`` file of your Scrapy
project:

.. code:: python

    from multifeedexporter import MultiFeedExporter

    EXTENSIONS = {
        # Disable Scrapy's built-in feed exporter
        'scrapy.contrib.feedexport.FeedExporter': None,
        # Enable the multi-feed exporter in its place
        'multifeedexporter.MultiFeedExporter': 500,
    }

    # Automatically configure available item names from your module
    # (BOT_NAME must already be defined earlier in settings.py)
    MULTIFEEDEXPORTER_ITEMS = MultiFeedExporter.get_bot_items(BOT_NAME)

Usage
-----

When calling ``scrapy crawl`` you need to use the ``%(item_name)s``
placeholder in the output file/URI name. The following calls to
``scrapy crawl`` demonstrate the placeholder:

.. code:: bash

    $ scrapy crawl -o "spider_name_%(item_name)s.csv" -t csv spider_name
    $ scrapy crawl -o "ftp://foo:bar@example.com/spider_name_%(item_name)s.csv" -t csv spider_name

If you omit the placeholder, all items will be placed in one file.
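
To make the effect of the placeholder concrete, here is a minimal sketch
of how ``%(item_name)s`` could expand. It uses plain Python ``%``-style
formatting and does not call the extension itself. The item names
``ProductItem`` and ``ReviewItem`` and the helper ``expand_feed_uri`` are
hypothetical, and the sketch assumes the extension substitutes each
item's name into the URI in this way:

.. code:: python

    # Hypothetical item names; assumed to be what replaces %(item_name)s.
    item_names = ["ProductItem", "ReviewItem"]


    def expand_feed_uri(uri_template, item_name):
        """Fill the item_name placeholder using %-style formatting."""
        return uri_template % {"item_name": item_name}


    with_placeholder = "spider_name_%(item_name)s.csv"
    without_placeholder = "spider_name.csv"

    # One distinct URI per item name when the placeholder is present.
    per_item_uris = [expand_feed_uri(with_placeholder, n) for n in item_names]

    # Without the placeholder, every item name maps to the same URI.
    shared_uris = [expand_feed_uri(without_placeholder, n) for n in item_names]

    print(per_item_uris)
    print(shared_uris)

With the placeholder, each item name yields its own output name; without
it, both names resolve to the same ``spider_name.csv``, which matches the
single-file behaviour described above.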

License
-------

scrapy-multifeedexporter is published under the MIT license.
