A simple scrapy library for Python.
shiertier_scrapy
Introduction
shiertier_scrapy is a Python library designed to simplify the process of downloading images from the web. It provides a robust and flexible interface for handling various HTTP status codes, retries, and image validation. This library is particularly useful for web scraping tasks where image downloads are required.
Installation
You can install shiertier_scrapy via pip:
pip install git+https://github.com/shiertier-utils/shiertier_scrapy.git
Please note that this project is still under development.
Environment Variables and Storage Location
Environment Variables
SCRAPY_SAVE_DIR: The directory where downloaded images are saved. If it is not set, the current working directory is used.
Setting the Storage Location
You can specify the storage location by setting the SCRAPY_SAVE_DIR environment variable:
export SCRAPY_SAVE_DIR=/path/to/save_directory
Alternatively, you can pass the save_dir parameter when initializing the ScrapyClientBase class:
from shiertier_scrapy import ScrapyClientBase
# Initialize with a custom save directory
scrapy_client = ScrapyClientBase(save_dir='/path/to/save_directory')
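The fallback described above can be illustrated with a minimal sketch. The helper name resolve_save_dir is hypothetical and not part of shiertier_scrapy, and the precedence shown (explicit argument first, then SCRAPY_SAVE_DIR, then the current working directory) is an assumption; the library's actual resolution order may differ.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_save_dir(save_dir: Optional[str] = None) -> Path:
    """Hypothetical helper: choose a save directory by assumed precedence."""
    # 1. An explicitly passed directory is assumed to win.
    if save_dir is not None:
        return Path(save_dir)
    # 2. Otherwise fall back to the SCRAPY_SAVE_DIR environment variable.
    env_dir = os.environ.get("SCRAPY_SAVE_DIR")
    if env_dir:
        return Path(env_dir)
    # 3. Finally, use the current working directory.
    return Path.cwd()
```

Calling resolve_save_dir() with no argument and no environment variable set returns the current working directory, which matches the default documented above.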
Usage
Downloading a Single Image
You can download a single image using the download_one method. This method requires the URL of the image and the name to save it under. It also supports retrying failed downloads, with a sleep time between attempts.
from shiertier_scrapy import easy_scrapy_client
# Download a single image
easy_scrapy_client.download_one(url='http://example.com/image.jpg', save_name='image.jpg')
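To show what "retries with a sleep time between them" can look like, here is a minimal sketch of the general pattern. It is not the library's implementation: the function fetch_with_retries, its parameter names retries and sleep_time, and the injected fetch callable are all hypothetical, chosen so the example runs without network access.

```python
import time
from typing import Callable, Optional


def fetch_with_retries(
    fetch: Callable[[str], bytes],
    url: str,
    retries: int = 3,
    sleep_time: float = 1.0,
) -> Optional[bytes]:
    """Call fetch(url) up to `retries` times; return None if every attempt fails."""
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except Exception:
            # Broad catch keeps the sketch short; real code would narrow this.
            # Sleep only when another attempt remains.
            if attempt < retries:
                time.sleep(sleep_time)
    return None
```

In practice the fetch callable might wrap requests.get(url, timeout=10) followed by raise_for_status(), so that HTTP error statuses also trigger a retry.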
Downloading Multiple Images
You can download multiple images concurrently using the download_images method. This method requires a list of URLs and a matching list of save names, one per URL. It also supports retrying failed downloads, with a sleep time between attempts.
from shiertier_scrapy import easy_scrapy_client
# URLs and save names
urls = ['http://example.com/image1.jpg', 'http://example.com/image2.jpg']
save_names = ['image1.jpg', 'image2.jpg']
# Download multiple images
easy_scrapy_client.download_images(urls=urls, save_names=save_names)
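Concurrent downloading of paired URLs and save names can be sketched with the standard library's thread pool. This is an illustration under stated assumptions, not the library's code: download_all, max_workers, and the injected download callable are hypothetical names, and whether download_images uses threads internally is not confirmed by this README.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def download_all(
    download: Callable[[str, str], bool],
    urls: List[str],
    save_names: List[str],
    max_workers: int = 4,
) -> Dict[str, bool]:
    """Run download(url, save_name) for each pair in a thread pool."""
    # Each URL needs exactly one save name, matched by position.
    if len(urls) != len(save_names):
        raise ValueError("urls and save_names must have the same length")
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with save_names.
        results = list(pool.map(download, urls, save_names))
    return dict(zip(save_names, results))
```

Threads suit this workload because image downloads spend most of their time waiting on the network rather than using the CPU.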
Dependencies
- requests
- Pillow
- shiertier_logger
- tqdm
License
This project is released under the MIT License. See the LICENSE file for details.