Scrapy spider middleware to split an item into multiple items on a multi-valued key
Project description
SplitVariantsMiddleware is a Scrapy spider middleware that splits a single item into multiple items when the item has a “variants” key with multiple values.
Example usage
Let’s assume your spider outputs an item with different size options (from an ecommerce website for example):
item = {"id": 12, "name": "Big chair", "variants": [{"size": "XL", "price": 200, "currency": "USD"}, {"size": "L", "price": 100, "currency": "USD"}]}
When you enable SplitVariantsMiddleware, this single item is split into two items, one per variant, each combining the shared fields with that variant's values:
{"id": 12, "name": "Big chair", "size": "XL", "price": 200, "currency": "USD"}
{"id": 12, "name": "Big chair", "size": "L", "price": 100, "currency": "USD"}
Installation
Install scrapy-splitvariants using pip:
$ pip install scrapy-splitvariants
Configuration
Add SplitVariantsMiddleware by including it in SPIDER_MIDDLEWARES in your settings.py file:
SPIDER_MIDDLEWARES = { 'scrapy_splitvariants.SplitVariantsMiddleware': 100, }
Here, priority 100 is just an example. Set its value depending on other middlewares you may have enabled already.
Enable the middleware by setting SPLITVARIANTS_ENABLED to True in your settings.py.
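Taken together, the two configuration steps might look like this in settings.py. This is a sketch that only combines the settings named above; the priority 100 is the same placeholder value and should be adjusted to fit your other middlewares.

```python
# settings.py

# Register the middleware; 100 is an example priority, not a recommendation.
SPIDER_MIDDLEWARES = {
    "scrapy_splitvariants.SplitVariantsMiddleware": 100,
}

# Turn the variant splitting on.
SPLITVARIANTS_ENABLED = True
```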
Hashes for scrapy-splitvariants-1.0.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | d86c2a0373d97f9bc40062624a661326ee65621f2ad015cfbf75bf01452b7564
MD5 | d9ff0c1d9c29cf805a381b2258ef437f
BLAKE2b-256 | f2fe1f4402a8a929b5392f8fd3dbbefbb89a174e8d05afe272d44dae3f66eb3b
Hashes for scrapy_splitvariants-1.0.0-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 28e12fca8e526a59a8a9d84df2c2f5066590a3313106905e3e95185504f8274b
MD5 | ff938535643a4f0f3485952f6cd919d3
BLAKE2b-256 | 7e79be3e50779d85c3bae05418428cad7dda0f7f4dd54754e14462e856f2efe4