A scrapy pipeline which stores files using folder trees.
scrapy-folder-tree
This is a scrapy pipeline that provides an easy way to store files and images using various folder structures.
Supported folder structures:
Given this scraped file, 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg, you can choose one of the following folder structures:
Using file name
class: scrapy_folder_tree.ImagesHashTreePipeline
full
└── 0
    └── 5
        └── b
            └── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
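The layout above can be reproduced with a few lines of plain Python. This is only a sketch of the idea, not the package's code: the helper name hash_tree_path and its depth and root parameters are hypothetical, and it assumes each folder level is named after one leading character of the file name.

```python
from pathlib import PurePosixPath


def hash_tree_path(filename: str, depth: int = 3, root: str = "full") -> str:
    """Nest `filename` under one folder per leading character of its name.

    Hypothetical helper for illustration; not part of scrapy_folder_tree.
    """
    stem = PurePosixPath(filename).stem   # file name without the extension
    levels = list(stem[:depth])           # "05b" -> ["0", "5", "b"]
    return str(PurePosixPath(root, *levels, filename))


print(hash_tree_path("05b40af07cb3284506acbf395452e0e93bfc94c8.jpg"))
# full/0/5/b/05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
```

Because the file name is a hash, its leading characters are spread fairly evenly, so files end up distributed across many small folders instead of one large one.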
Using crawling time
class: scrapy_folder_tree.ImagesTimeTreePipeline
full
└── 0
    └── 11
        └── 48
            └── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
Using crawling date
class: scrapy_folder_tree.ImagesDateTreePipeline
full
└── 2022
    └── 1
        └── 24
            └── 05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
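The time and date layouts follow the same pattern, with the folder names taken from the moment of the crawl instead of the file name. The sketch below is an illustration only: time_tree_path and date_tree_path are hypothetical helpers, and it assumes the levels are hour/minute/second and year/month/day without zero padding, as the examples above suggest. It does not model how FOLDER_TREE_DEPTH affects these layouts.

```python
from datetime import datetime
from pathlib import PurePosixPath


def time_tree_path(filename: str, when: datetime, root: str = "full") -> str:
    """Hypothetical helper: nest the file under hour/minute/second folders."""
    levels = (str(when.hour), str(when.minute), str(when.second))
    return str(PurePosixPath(root, *levels, filename))


def date_tree_path(filename: str, when: datetime, root: str = "full") -> str:
    """Hypothetical helper: nest the file under year/month/day folders."""
    levels = (str(when.year), str(when.month), str(when.day))
    return str(PurePosixPath(root, *levels, filename))


name = "05b40af07cb3284506acbf395452e0e93bfc94c8.jpg"
crawled_at = datetime(2022, 1, 24, 0, 11, 48)  # example crawl timestamp

print(time_tree_path(name, crawled_at))
# full/0/11/48/05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
print(date_tree_path(name, crawled_at))
# full/2022/1/24/05b40af07cb3284506acbf395452e0e93bfc94c8.jpg
```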
Installation
pip install scrapy_folder_tree
Usage
Use the following settings in your project:
ITEM_PIPELINES = {
    'scrapy_folder_tree.FilesHashTreePipeline': 300
}
FOLDER_TREE_DEPTH = 3
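A slightly fuller settings sketch may help when wiring this into a project. It assumes the package's pipelines build on Scrapy's own FilesPipeline and ImagesPipeline, which only run when a storage location is set (FILES_STORE or IMAGES_STORE) and read URLs from the item's file_urls or image_urls field. The "downloads" directory is a placeholder, and reading FOLDER_TREE_DEPTH as the number of nested folder levels is an assumption based on the three-level examples above.

```python
# settings.py (sketch; the storage path is a placeholder)

ITEM_PIPELINES = {
    # Pipeline class path as given in the usage section of this README.
    "scrapy_folder_tree.FilesHashTreePipeline": 300,
}

# Standard Scrapy setting: FilesPipeline stays disabled without it.
# For an Images*TreePipeline, set IMAGES_STORE instead.
FILES_STORE = "downloads"

# Assumed meaning: how many nested folder levels to create.
FOLDER_TREE_DEPTH = 3
```

With these settings, a spider would yield items carrying a file_urls list (or image_urls for the image pipelines) for the downloads to be picked up.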