Baidu Image Spider
Baidu Image Crawling
A super lightweight Baidu image crawler, modified from https://github.com/kong36088/BaiduImageCrawling
Installation
pip install baidu_image_crawling
Python usage
from baidu_image_crawling.main import Crawler
crawler = Crawler(0.05, save_dir="outputs")  # crawl delay is 0.05
# Crawl the keyword "美女": 2 pages in total, starting at page 1, 30 images per page, i.e. 2*30=60 images in total
crawler(word="美女", total_page=2, start_page=1, per_page=30)
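The paging arithmetic in the comment above can be sketched as a small standalone helper. This is only an illustration: `total_images` and `page_ranges` are hypothetical names that are not part of `baidu_image_crawling`, and the assumption that page numbers are 1-based and map to contiguous image offsets is not confirmed by the package documentation.

```python
def total_images(total_page, per_page=30):
    """Number of images requested: pages multiplied by page size."""
    return total_page * per_page


def page_ranges(total_page, start_page=1, per_page=30):
    """Return a (first_offset, end_offset_exclusive) pair for each page.

    Hypothetical helper: assumes 1-based page numbers and contiguous offsets,
    which may not match how the crawler pages through results internally.
    """
    if total_page < 0 or start_page < 1 or per_page < 1:
        raise ValueError("total_page must be >= 0; start_page and per_page must be >= 1")
    first = (start_page - 1) * per_page
    return [
        (first + i * per_page, first + (i + 1) * per_page)
        for i in range(total_page)
    ]


if __name__ == "__main__":
    # Same numbers as the call above: 2 pages of 30 images, starting at page 1.
    print(total_images(2, 30))                         # 60
    print(page_ranges(2, start_page=1, per_page=30))   # [(0, 30), (30, 60)]
```

Under the same assumption, `start_page=3` with `per_page=10` would begin at offset 20.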
Terminal usage
baidu_image_crawling -w 美女 -tp 1 -sp 1 -pp 2
View the parameter documentation:
$ baidu_image_crawling -h
usage: baidu_image_crawling [-h] -w WORD -tp TOTAL_PAGE -sp START_PAGE [-pp [PER_PAGE]] [-sd SAVE_DIR] [-d DELAY]
options:
  -h, --help            show this help message and exit
  -w WORD, --word WORD  keyword to crawl
  -tp TOTAL_PAGE, --total_page TOTAL_PAGE
                        total number of pages to crawl
  -sp START_PAGE, --start_page START_PAGE
                        starting page number
  -pp [PER_PAGE], --per_page [PER_PAGE]
                        number of images per page
  -sd SAVE_DIR, --save_dir SAVE_DIR
                        directory where images are saved
  -d DELAY, --delay DELAY
                        crawl delay (interval between requests)
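For readers who want to see how an option set like this is typically declared, here is a minimal `argparse` sketch that mirrors the usage line above. It is not the package's actual source: the argument types and the default values (`30`, `"outputs"`, `0.05`) are assumptions taken from the Python example, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="baidu_image_crawling")
    parser.add_argument("-w", "--word", type=str, required=True,
                        help="keyword to crawl")
    parser.add_argument("-tp", "--total_page", type=int, required=True,
                        help="total number of pages to crawl")
    parser.add_argument("-sp", "--start_page", type=int, required=True,
                        help="starting page number")
    # nargs="?" yields the "[-pp [PER_PAGE]]" form in the usage line.
    # The default and const of 30 are assumptions, not documented values.
    parser.add_argument("-pp", "--per_page", type=int, nargs="?",
                        default=30, const=30,
                        help="number of images per page")
    # Assumed defaults, borrowed from the Python usage example.
    parser.add_argument("-sd", "--save_dir", type=str, default="outputs",
                        help="directory where images are saved")
    parser.add_argument("-d", "--delay", type=float, default=0.05,
                        help="crawl delay (interval between requests)")
    return parser


if __name__ == "__main__":
    # Parse the same arguments as the terminal example.
    args = build_parser().parse_args(
        ["-w", "美女", "-tp", "1", "-sp", "1", "-pp", "2"]
    )
    print(args.word, args.total_page, args.start_page, args.per_page)
```

With these arguments the terminal example requests one page of two images.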
Source Distributions
No source distribution files available for this release.
Built Distribution
File details
Details for the file baidu_image_crawling-0.0.1-py3-none-any.whl.
File metadata
- Download URL: baidu_image_crawling-0.0.1-py3-none-any.whl
- Upload date:
- Size: 6.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.9.21
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2c955d3dce0395ae1abe88f8effc013a3c8052fa973458cb8749ff7861e2b9c4 |
| MD5 | 8fdfddc0a9743bfc56f24383baae2635 |
| BLAKE2b-256 | 89674631eeec8dbe49b67eac2378c8aaeeffc7759e3c7f326bee847a2d5d8183 |
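The SHA256 digest listed above can be used to check a downloaded copy of the wheel. The following is a minimal sketch using only the standard library. The helper names `sha256_of_file` and `matches_expected` are hypothetical, and the file path passed in is assumed to point at a wheel you have already downloaded.

```python
import hashlib

# SHA256 digest copied from the file hashes listed above.
EXPECTED_SHA256 = "2c955d3dce0395ae1abe88f8effc013a3c8052fa973458cb8749ff7861e2b9c4"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Compare a file's SHA256 digest with the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("baidu_image_crawling-0.0.1-py3-none-any.whl")` should return `True` for an unmodified download, assuming the wheel sits in the current directory.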