A simple, fast asynchronous crawler framework
HSSP Crawler Framework
A crawler framework built on Python asyncio (under development).
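Author
Features
- Uses parsel, the selector library from the scrapy framework, as the built-in page selector
- Automatic retry on exceptions, based on tenacity
- Optional random User-Agent, based on fake-useragent
- Multiple optional downloaders: httpx, aiohttp, requests, curl-cffi, requests-go
- Listeners before a request, after a response, and after a retry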
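The retry and listener features can be pictured with a small standard-library sketch. It is not the hssp API and it does not use tenacity: `MiniClient`, its `on_request`/`on_response`/`on_retry` hooks, and `make_flaky_fetch` are hypothetical names chosen for illustration, and the downloader is a fake that fails a fixed number of times.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple

# Hypothetical names throughout: this is not the hssp API.
Fetch = Callable[[str], Awaitable[str]]


class MiniClient:
    """Toy async client with retry and three listener hooks."""

    def __init__(self, fetch: Fetch, max_attempts: int = 3) -> None:
        self.fetch = fetch
        self.max_attempts = max_attempts
        self.events: List[str] = []  # records which hooks fired, in order

    def on_request(self, url: str) -> None:
        self.events.append(f"request:{url}")

    def on_response(self, url: str, body: str) -> None:
        self.events.append(f"response:{url}")

    def on_retry(self, url: str, attempt: int, exc: Exception) -> None:
        self.events.append(f"retry:{attempt}")

    async def get(self, url: str) -> str:
        self.on_request(url)
        for attempt in range(1, self.max_attempts + 1):
            try:
                body = await self.fetch(url)
            except ConnectionError as exc:
                if attempt == self.max_attempts:
                    raise  # out of attempts: surface the last error
                self.on_retry(url, attempt, exc)
                await asyncio.sleep(0)  # a real client would back off here
            else:
                self.on_response(url, body)
                return body
        raise RuntimeError("max_attempts must be at least 1")


def make_flaky_fetch(failures: int) -> Fetch:
    """Return a fake downloader that fails `failures` times, then succeeds."""
    remaining = failures

    async def fetch(url: str) -> str:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            raise ConnectionError("simulated network error")
        return f"<html>{url}</html>"

    return fetch


async def retry_demo() -> Tuple[str, List[str]]:
    client = MiniClient(make_flaky_fetch(failures=2))
    body = await client.get("https://example.com")
    return body, client.events


if __name__ == "__main__":
    print(asyncio.run(retry_demo()))
```

In this sketch the request hook fires once per logical request and the retry hook fires once per failed attempt. Whether hssp fires its listeners at the same points is not stated in this README.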
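Roadmap
- Add other parsers
- Temporarily switch downloaders during a request: for example, net is initialized with the httpx downloader, one request temporarily switches to DrissionPage, and all other requests still use httpx
- Support browser-rendering downloaders based on DrissionPage and playwright
- Support more configuration options and custom settings for downloaders
- Write detailed usage documentation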
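The per-request downloader switch on the roadmap could look roughly like the sketch below. This is only one possible design, not a planned or existing hssp interface: the `Net` class, the `downloader=` keyword, and `FakeDownloader` are assumptions made for illustration, and the fake downloaders only tag the URL with their name instead of performing any network I/O.

```python
import asyncio
from typing import List, Optional, Protocol


class Downloader(Protocol):
    """Minimal interface assumed for this sketch."""

    name: str

    async def download(self, url: str) -> str: ...


class FakeDownloader:
    """Stand-in downloader that only tags the URL with its own name."""

    def __init__(self, name: str) -> None:
        self.name = name

    async def download(self, url: str) -> str:
        return f"{self.name}:{url}"


class Net:
    """Hypothetical client: a default downloader plus a per-call override."""

    def __init__(self, downloader: Downloader) -> None:
        self.downloader = downloader

    async def get(self, url: str, downloader: Optional[Downloader] = None) -> str:
        # Use the override for this call only; the default is left untouched.
        chosen = downloader if downloader is not None else self.downloader
        return await chosen.download(url)


async def switch_demo() -> List[str]:
    net = Net(FakeDownloader("httpx"))
    browser = FakeDownloader("DrissionPage")
    results = await asyncio.gather(
        net.get("https://example.com/a"),
        net.get("https://example.com/b", downloader=browser),
        net.get("https://example.com/c"),
    )
    return list(results)


if __name__ == "__main__":
    print(asyncio.run(switch_demo()))
```

Passing the override as an argument rather than mutating the client keeps concurrent requests from seeing each other's downloader choice.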
Installation
Install hssp with pip:
pip install hssp
Install hssp with uv:
uv add hssp
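To confirm that the install worked, the installed version can be queried with the standard library. This is a minimal sketch; the helper name `installed_version` is made up here, and the only thing taken from this README is the distribution name `hssp`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("hssp")
    print(f"hssp {found}" if found else "hssp is not installed")
```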
Support
For support, send an email to xhrtxh@gmail.com.
Development and testing
The project uses uv to manage dependencies, so install uv first.
uv sync