Fast multithreaded crawler for web page image resources
Project description
# crawl_image
## Introduction
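Quickly downloads every image resource on a web page to a specified path using multiple threads. It works by extracting the `src` of each `img` tag, combining it with the domain to form the full resource URL, and dispatching those URLs to the program's threads for download.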
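The pipeline described above can be sketched with the standard library alone. This is a minimal illustration, not the package's actual implementation: `ImgSrcParser`, `extract_image_urls`, `download`, and `crawl_images` are hypothetical names, and `urljoin` stands in for the "combine with the domain" step.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class ImgSrcParser(HTMLParser):
    """Hypothetical helper: collects the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_image_urls(page_url, html):
    """Resolve each img src against the page URL to get an absolute URL."""
    parser = ImgSrcParser()
    parser.feed(html)
    return [urljoin(page_url, src) for src in parser.sources]


def download(url, save_dir):
    """Fetch one image and write it under save_dir; returns the file path."""
    name = Path(urlparse(url).path).name or "index"
    target = Path(save_dir) / name
    with urlopen(url, timeout=10) as response:
        target.write_bytes(response.read())
    return target


def crawl_images(page_url, html, save_dir, workers=8):
    """Hand every image URL to a pool of worker threads."""
    Path(save_dir).mkdir(parents=True, exist_ok=True)
    urls = extract_image_urls(page_url, html)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda u: download(u, save_dir), urls))
```

Keeping URL extraction separate from downloading means the parsing step can be checked without network access. Threads suit this job because the work is dominated by waiting on I/O.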
## Example
```py
from crawl_image.model.img_crawl_model import ImgCrawlModel
from crawl_image.img_crawl import crawl_start

# Configure the crawl: the page to scan and the directory to save images into.
img_crawl_model = ImgCrawlModel()
img_crawl_model.url = 'http://huaban.com/'
img_crawl_model.img_save_path = 'D:/crawl/image'

# Start the multithreaded download.
crawl_start(img_crawl_model)
```
## Features
- High-speed downloads
- Fetches every image on the page
- Automatically detects the page encoding
- Filters by image type
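The last two features could look roughly like the following. This is a sketch under stated assumptions rather than the package's own code: `decode_page`, `filter_by_type`, and `ALLOWED_TYPES` are hypothetical names, and the fallback order (header charset, then a `charset=` declaration in the markup, then UTF-8) is one reasonable choice among several.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical default set of accepted image extensions.
ALLOWED_TYPES = {".jpg", ".jpeg", ".png", ".gif"}


def decode_page(raw, header_charset=None):
    """Decode page bytes, trying the header charset, then a charset
    declared in the markup, then UTF-8."""
    candidates = []
    if header_charset:
        candidates.append(header_charset)
    # Look for charset=... near the top of the document (e.g. a <meta> tag).
    match = re.search(rb'charset=["\']?([\w-]+)', raw[:2048], re.IGNORECASE)
    if match:
        candidates.append(match.group(1).decode("ascii"))
    candidates.append("utf-8")
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (LookupError, UnicodeDecodeError):
            continue  # unknown codec or wrong guess: try the next candidate
    # Last resort: keep going with replacement characters.
    return raw.decode("utf-8", errors="replace")


def filter_by_type(urls, allowed=ALLOWED_TYPES):
    """Keep only URLs whose path ends in an allowed image extension."""
    return [
        url for url in urls
        if PurePosixPath(urlparse(url).path).suffix.lower() in allowed
    ]
```

Matching on the URL path rather than the whole URL keeps query strings such as `?size=large` from hiding the extension.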
## Communication
- 未来已来 ("The Future Is Here"): 203737026
## Copyright and License
code for you
Download files
Source Distribution
crawl_image-0.0.5.tar.gz (8.8 kB)
File details
Details for the file crawl_image-0.0.5.tar.gz.
File metadata
- Download URL: crawl_image-0.0.5.tar.gz
- Upload date:
- Size: 8.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.18.4 setuptools/28.8.0 requests-toolbelt/0.8.0 tqdm/4.29.0 CPython/3.6.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 758bb773af519bbf2cd513a977c9b07c750e62a6ff980c2ec35a75dd0e3e308b
MD5 | 376466e2c25f56d0fca3511b56e5220e
BLAKE2b-256 | e3929675882c109b7e29ae90ff8aba8feec3e7ed6a95277d157872e1e90201a5