A simple, fast asynchronous crawler framework

Project description

HSSP Crawler Framework

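A crawler framework built on Python asyncio (under development)

Author

Features

  • Uses parsel, the Scrapy framework's selector library, as the built-in page selector
  • Automatic retry on exceptions, based on tenacity
  • Optional random User-Agent, based on fake-useragent
  • A choice of downloaders: httpx, aiohttp, requests, curl-cffi, requests-go
  • Listeners that run before a request, after a response, and after a retry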
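
The retry and listener features above can be illustrated with a minimal standard-library sketch. It is not hssp's API: `fetch_with_retry`, its hook parameters, and `flaky_download` are hypothetical names, and a plain loop stands in for the tenacity-based retry that hssp uses. Here the retry hook fires each time a failed attempt is about to be repeated.

```python
import asyncio
from typing import Awaitable, Callable, List

Hook = Callable[[str], None]


async def fetch_with_retry(
    url: str,
    download: Callable[[str], Awaitable[str]],
    max_attempts: int = 3,
    on_request: Hook = lambda url: None,
    on_response: Hook = lambda url: None,
    on_retry: Hook = lambda url: None,
) -> str:
    """Hypothetical helper (not part of hssp): retry a download and fire hooks."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        on_request(url)  # listener: before the request is sent
        try:
            body = await download(url)
        except ConnectionError:
            if attempt == max_attempts:
                raise  # attempts exhausted, surface the error
            on_retry(url)  # listener: a retry has been scheduled
            continue
        on_response(url)  # listener: after a response arrives
        return body
    raise RuntimeError("unreachable")


async def demo() -> List[str]:
    events: List[str] = []
    calls = {"n": 0}

    async def flaky_download(url: str) -> str:
        # Simulated downloader: fails twice, then succeeds. No network I/O.
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("simulated failure")
        return f"<html>{url}</html>"

    body = await fetch_with_retry(
        "https://example.com",
        flaky_download,
        on_request=lambda u: events.append("request"),
        on_response=lambda u: events.append("response"),
        on_retry=lambda u: events.append("retry"),
    )
    events.append(body)
    return events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Passing the hooks as plain callables keeps the sketch small; a real framework would more likely register listeners on a client object.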

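Roadmap

  • Add other parsers
  • Temporarily switch the downloader during a request: for example, net is initialized with the httpx downloader, one request needs to switch to DrissionPage temporarily, and the rest keep using httpx
  • Downloaders backed by DrissionPage and playwright browser rendering
  • More configuration options and custom settings for downloaders
  • Write detailed usage documentation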
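
The per-request downloader switch on the roadmap could look roughly like the sketch below. This is only one possible design, not hssp's planned or actual API: the `Net` class, its `get` method, the `downloader` keyword, and the two stand-in downloader functions are all hypothetical, and no real HTTP client or browser is involved.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional

Downloader = Callable[[str], Awaitable[str]]


async def httpx_like(url: str) -> str:
    # Stand-in for an HTTP-client downloader; performs no network I/O.
    return f"httpx:{url}"


async def browser_like(url: str) -> str:
    # Stand-in for a browser-rendering downloader; performs no rendering.
    return f"browser:{url}"


class Net:
    """Hypothetical client (not hssp's real class) with a default downloader."""

    def __init__(self, downloaders: Dict[str, Downloader], default: str) -> None:
        if default not in downloaders:
            raise KeyError(default)
        self._downloaders = downloaders
        self._default = default

    async def get(self, url: str, downloader: Optional[str] = None) -> str:
        # A per-call override wins; otherwise fall back to the default.
        name = downloader or self._default
        return await self._downloaders[name](url)


async def demo() -> List[str]:
    net = Net({"httpx": httpx_like, "browser": browser_like}, default="httpx")
    return [
        await net.get("a"),                        # default downloader
        await net.get("b", downloader="browser"),  # temporary switch
        await net.get("c"),                        # back to the default
    ]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the override is resolved per call, the default is never mutated, so concurrent requests cannot see each other's temporary choice.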

Installation

Install hssp with pip

pip install hssp

Install hssp with uv

uv add hssp

Support

For support, please email xhrtxh@gmail.com

Development and testing

The project uses uv to manage dependencies, so install uv first

    uv sync

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hssp-0.4.18.tar.gz (17.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

hssp-0.4.18-py3-none-any.whl (23.0 kB view details)

Uploaded Python 3

File details

Details for the file hssp-0.4.18.tar.gz.

File metadata

  • Download URL: hssp-0.4.18.tar.gz
  • Upload date:
  • Size: 17.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for hssp-0.4.18.tar.gz
Algorithm Hash digest
SHA256 5032ecfb90d8a40476080da435b8d1934c66490296252bd9d83ec685d7271fdf
MD5 cae04cd1006d7fcf434fe2b0c2afcdb6
BLAKE2b-256 1b49bcd2d8e6e79b53e8dd89882e0d0763d252c58cae7f724a8b353836ad4da7

See more details on using hashes here.
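
A downloaded file can be checked against the SHA256 digest listed above with a short standard-library sketch. The helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the example call in the comments assumes the archive has already been saved in the current directory.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published hex digest."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)


# Example call, assuming hssp-0.4.18.tar.gz was downloaded locally:
# matches_published_digest(
#     "hssp-0.4.18.tar.gz",
#     "5032ecfb90d8a40476080da435b8d1934c66490296252bd9d83ec685d7271fdf",
# )
```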

File details

Details for the file hssp-0.4.18-py3-none-any.whl.

File metadata

  • Download URL: hssp-0.4.18-py3-none-any.whl
  • Upload date:
  • Size: 23.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for hssp-0.4.18-py3-none-any.whl
Algorithm Hash digest
SHA256 95c7fe09d1e1611c11bd73a80093fa4945dd8188b015e1c0c8ef28b903799517
MD5 a29a79b0c5ed712f6725ca34ea459ee9
BLAKE2b-256 cdc0e9ecead4df168d3c4db493cb00fa3fb6591bb388ce805e366ef72e67d1e0

See more details on using hashes here.
