A simple, fast asynchronous crawler framework

HSSP Crawler Framework

A crawler framework built on Python asyncio (under development)

Author

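Features

  • Uses parsel, the selector library from the scrapy framework, as the built-in page selector
  • Automatic retry on exceptions, based on tenacity
  • Optional random User-Agent, based on fake-useragent
  • A choice of downloaders: httpx, aiohttp, requests, curl-cffi, and others
  • Listeners that run before a request, after a response, and after a retry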
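
The retry and listener features above can be pictured with a small standard-library sketch. This is not hssp's API: `Hooks`, `fetch_with_retry`, and `make_flaky_download` are hypothetical names, the downloader is a fake that fails on purpose, and a plain loop stands in for tenacity so the example has no third-party dependencies.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, List

# Every name in this block is hypothetical and is NOT part of hssp's API.


@dataclass
class Hooks:
    """Listener lists for the three events named in the feature list."""
    before_request: List[Callable[[str], None]] = field(default_factory=list)
    after_response: List[Callable[[str, str], None]] = field(default_factory=list)
    # Called after a failed attempt, just before the next attempt is made.
    after_retry: List[Callable[[str, int, Exception], None]] = field(default_factory=list)


async def fetch_with_retry(
    url: str,
    download: Callable[[str], Awaitable[str]],
    hooks: Hooks,
    max_attempts: int = 3,
    delay: float = 0.0,
) -> str:
    """Await download(url), retrying on any exception up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        for hook in hooks.before_request:
            hook(url)
        try:
            body = await download(url)
        except Exception as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            for hook in hooks.after_retry:
                hook(url, attempt, exc)
            await asyncio.sleep(delay)
            continue
        for hook in hooks.after_response:
            hook(url, body)
        return body
    raise ValueError("max_attempts must be at least 1")


def make_flaky_download(failures: int) -> Callable[[str], Awaitable[str]]:
    """Build a fake downloader that fails `failures` times, then succeeds."""
    state = {"left": failures}

    async def download(url: str) -> str:
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionError("simulated network failure")
        return f"<html>{url}</html>"

    return download


def run_demo():
    """Run one request that fails twice, recording which hooks fired."""
    events = []
    hooks = Hooks(
        before_request=[lambda url: events.append("before")],
        after_response=[lambda url, body: events.append("after")],
        after_retry=[lambda url, attempt, exc: events.append(f"retry{attempt}")],
    )
    body = asyncio.run(
        fetch_with_retry("https://example.com", make_flaky_download(2), hooks)
    )
    return body, events
```

Calling `run_demo()` returns the page body together with the ordered list of hook events, which makes it easy to see that the request listener fires once per attempt while the response listener fires only on success.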

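Planned

  • Switch downloaders temporarily for a single request: for example, net is initialized with the httpx downloader, one request switches to DrissionPage temporarily, and all other requests keep using httpx
  • A downloader backed by DrissionPage browser rendering
  • A downloader backed by playwright browser rendering
  • More configuration and customization options for curl-cffi
  • Detailed usage documentation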
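
One way to picture the planned per-request downloader switch is the sketch below. It is only an illustration of the idea, not hssp's design: `Client`, `httpx_like`, and `browser_like` are hypothetical names, and the two downloaders are stand-ins that return tagged strings instead of performing real HTTP or browser work.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional

Downloader = Callable[[str], Awaitable[str]]


# Stand-in downloaders (hypothetical). Real ones would wrap httpx,
# DrissionPage, and so on; these only tag the URL with their own name.
async def httpx_like(url: str) -> str:
    return f"httpx:{url}"


async def browser_like(url: str) -> str:
    return f"browser:{url}"


class Client:
    """Hypothetical client with a default downloader and a per-request override."""

    def __init__(self, downloaders: Dict[str, Downloader], default: str) -> None:
        if default not in downloaders:
            raise KeyError(f"unknown default downloader: {default}")
        self._downloaders = downloaders
        self._default = default

    async def get(self, url: str, downloader: Optional[str] = None) -> str:
        # The override applies to this call only; the default is never mutated.
        name = downloader if downloader is not None else self._default
        return await self._downloaders[name](url)


async def crawl() -> List[str]:
    client = Client({"httpx": httpx_like, "browser": browser_like}, default="httpx")
    # Only the second request switches downloader; the others use the default.
    return await asyncio.gather(
        client.get("https://example.com/a"),
        client.get("https://example.com/b", downloader="browser"),
        client.get("https://example.com/c"),
    )
```

Passing the override as an argument, rather than changing state on the client, keeps concurrent requests from affecting one another.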

Installation

Install hssp with pip

pip install hssp

Install hssp with rye

rye add hssp

Roadmap

  • Random User-Agent based on fake-useragent
  • curl-cffi support
  • DrissionPage support

Support

For support, email xhrtxh@gmail.com

Development and testing

The project uses rye to manage dependencies, so install rye first

    rye sync

Download files

Source Distribution

hssp-0.4.7.tar.gz (34.6 kB)

Uploaded Source

Built Distribution

hssp-0.4.7-py3-none-any.whl (20.0 kB)

Uploaded Python 3

File details

Details for the file hssp-0.4.7.tar.gz.

File metadata

  • Download URL: hssp-0.4.7.tar.gz
  • Upload date:
  • Size: 34.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for hssp-0.4.7.tar.gz
Algorithm Hash digest
SHA256 186f05c3c2a57e224a8340b50e1c2fe66b7d710357cec68e4b1fba67aa2c46b8
MD5 0ab32e96b074b11b2cb4ece15e7e759c
BLAKE2b-256 139a2fe1e2a34e94ebc455dbf4d807cb72548ae163831105bf88255f59e0828d
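
A downloaded file can be checked against the SHA256 digest listed above with a few lines of standard-library Python. This is a minimal sketch: `sha256_of` and `matches` are hypothetical helper names, and the local file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 with an expected digest, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches("hssp-0.4.7.tar.gz", "186f05c3c2a57e224a8340b50e1c2fe66b7d710357cec68e4b1fba67aa2c46b8")` should return True for an intact copy of the source distribution saved under that name.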

File details

Details for the file hssp-0.4.7-py3-none-any.whl.

File metadata

  • Download URL: hssp-0.4.7-py3-none-any.whl
  • Upload date:
  • Size: 20.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for hssp-0.4.7-py3-none-any.whl
Algorithm Hash digest
SHA256 9a3a4b8eab3e7d53ee5589585ba870bae34f043cdeefe82fe2551f5560da4273
MD5 d841e0d749543dac268a823d3d46c6fe
BLAKE2b-256 dc0ea6068350f1c30cb2fa58de4d923ff6e00a2a7c9d2fdbd0d894722333ba7a
