WebU

Web Utils for browsing and scraping.

Install

pip install webu --upgrade

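The default install now includes only the most basic runtime dependencies: requests and tclogger. Heavy dependencies such as browser automation, FastAPI services, panels, the proxy pool, CAPTCHA recognition, MongoDB, and Hugging Face have been split into optional extras; install them as needed.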
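Because most features live in optional extras, it can help to check which ones are usable in the current environment. The following is a minimal sketch, not part of webu itself: the `PROBE_MODULES` mapping and the `installed_extras` helper are hypothetical, and each extra is probed through a single representative import name rather than its full dependency list.

```python
from importlib.util import find_spec

# Assumed mapping: one representative import name per extra.
# This is illustrative only and is not webu's real dependency list.
PROBE_MODULES = {
    "browser": "DrissionPage",
    "embed": "numpy",
    "captcha": "playwright",
    "fastapi": "fastapi",
    "dashboard": "dash",
}


def installed_extras(probes: dict) -> dict:
    """Map each extra name to True if its probe module can be located."""
    return {extra: find_spec(module) is not None for extra, module in probes.items()}


# Report which probe modules are importable without actually importing them.
for extra, present in installed_extras(PROBE_MODULES).items():
    print(f"{extra:10} {'installed' if present else 'missing'}")
```

`find_spec` only locates a module, so the check stays cheap and does not trigger the heavy imports the extras are meant to isolate.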

Common Installs

Base functionality, suited to pure HTTP clients such as LLMClient and GeminiClient:

pip install -U webu

HTML / search-result parsing, for webu.google_api.parser and webu.gemini.parser:

pip install -U "webu[parsing]"

DrissionPage browser functionality, for webu.browsers.chrome and webu.searches.*:

pip install -U "webu[browser]"

Embedding-vector client, for webu.embed:

pip install -U "webu[embed]"

Automatic CAPTCHA solving, for webu.captcha:

pip install -U "webu[captcha]"
playwright install chromium

Service Installs

Google Search API service (ggsc):

pip install -U "webu[google-api]"
playwright install chromium

If you also need the built-in Dash panel:

pip install -U "webu[google-api,google-api-panel]"
playwright install chromium

To solve reCAPTCHA image challenges automatically, add the captcha extra as well:

pip install -U "webu[google-api,captcha]"
playwright install chromium

Google Docker / HF Spaces tool (ggdk):

pip install -U "webu[google-docker]"
playwright install chromium

If you also need the built-in Dash panel:

pip install -U "webu[google-docker,google-docker-panel]"
playwright install chromium

Google Hub scheduling service (gghb):

pip install -U "webu[google-hub]"

If you also need the built-in Dash panel:

pip install -U "webu[google-hub,google-hub-panel]"

Gemini browser server-side functionality:

pip install -U "webu[gemini]"
playwright install chromium

Proxy API service (pxsc):

pip install -U "webu[proxy-api]"

WARP API service (cfwp):

pip install -U "webu[warp-api]"

Cloudflare Tunnel tool (cftn):

pip install -U "webu[cf-tunnel]"

IPv6-related functionality:

pip install -U "webu[ipv6]"

Install everything:

pip install -U "webu[all]"
playwright install chromium

Extra Summary

| Extra | Modules / commands | Description |
| --- | --- | --- |
| parsing | webu.google_api.parser, webu.gemini.parser | HTML parsing dependencies only |
| browser / searches | webu.browsers.chrome, webu.searches.* | DrissionPage + virtual display |
| embed | webu.embed | numpy only |
| captcha | webu.captcha | Playwright + OpenCV + numpy + httpx |
| fastapi | webu.fastapis.* | FastAPI / Uvicorn / Pydantic |
| dashboard | Google API / Hub panels | Dash + A2WSGI |
| proxy | webu.proxy_api.*, webu.google_api.proxy_manager | aiohttp / SOCKS / MongoDB |
| cf-tunnel | webu.cf_tunnel.*, cftn | Cloudflare Tunnel CLI dependencies |
| gemini | webu.gemini.* | Dependencies for the Gemini browser server |
| google-api | webu.google_api.*, ggsc | Google search service core; excludes CAPTCHA image solving and the Dash panel |
| google-api-panel | Google API panel | Dash panel dependencies for Google API |
| google-docker | webu.google_docker.*, ggdk | Google Docker / HF Spaces tool core; excludes the Dash panel |
| google-docker-panel | Google Docker panel | Dash panel dependencies for Google Docker |
| google-hub | webu.google_hub.*, gghb | Hub scheduling service core; excludes the Dash panel |
| google-hub-panel | Google Hub panel | Dash panel dependencies for Google Hub |
| proxy-api | webu.proxy_api.*, pxsc | Proxy collection, validation, and serving |
| warp-api | webu.warp_api.*, cfwp | WARP management service |
| ipv6 | webu.ipv6.* | IPv6 routing, sessions, and services |
| all | All modules | Installs every optional dependency |
| dev | Testing / development | pytest + pytest-asyncio |
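The extras listed in the table are declared in the package metadata, so they can also be read back from an installed copy. This is a small sketch using the standard-library `importlib.metadata`; the helper name `declared_extras` is hypothetical, and the output depends on whichever webu version is installed locally.

```python
from importlib import metadata


def declared_extras(dist_name: str) -> list:
    """Return the extras a distribution declares, or [] if it is not installed."""
    try:
        meta = metadata.metadata(dist_name)
    except metadata.PackageNotFoundError:
        return []
    # "Provides-Extra" appears once per extra in the core metadata.
    return sorted(meta.get_all("Provides-Extra") or [])


# Prints the extras of the locally installed webu, or [] when it is absent.
print(declared_extras("webu"))
```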

Combining Extras

Several extras can be installed at once:

pip install -U "webu[google-api,google-api-panel,captcha,google-hub]"
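When the set of extras is chosen by a script (for example in a provisioning step), the combined requirement string can be assembled programmatically. The sketch below is illustrative only; `requirement` and `pip_command` are hypothetical helpers, and the command is built but not executed.

```python
import sys


def requirement(package: str, extras: list) -> str:
    """Build a requirement such as 'webu[a,b]', dropping blanks and duplicates."""
    names = list(dict.fromkeys(e.strip() for e in extras if e.strip()))
    return f"{package}[{','.join(names)}]" if names else package


def pip_command(package: str, extras: list) -> list:
    """Return the argv for upgrading the package with the given extras."""
    return [sys.executable, "-m", "pip", "install", "-U", requirement(package, extras)]


print(requirement("webu", ["google-api", "captcha", "google-api"]))
# prints: webu[google-api,captcha]
```

Passing the list from `pip_command` to `subprocess.run` would perform the install with the same interpreter that runs the script, which avoids installing into a different environment by accident.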

Notes

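  • playwright is only the Python package; before browser-related features can be used for the first time, you still need to run playwright install chromium to download the browser itself.
  • google-api / google-hub / google-docker now start their services even when Dash is not installed; the panel is simply not mounted.
  • import webu and several subpackage entry points now use lazy imports, so a missing optional dependency no longer makes the whole package fail to import.
  • If you only need one lightweight submodule, install its specific extra rather than defaulting to webu[all].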
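The lazy-import behavior described in the notes above can be illustrated with a module-level `__getattr__` (PEP 562). This is a generic sketch of the technique, not webu's actual implementation: the package name `demo_pkg`, the `LAZY` mapping, and the `demo-extra` name are all hypothetical.

```python
import importlib
import types

# Hypothetical mapping: attribute name -> (module to import, extra that provides it).
LAZY = {
    "json_tools": ("json", "core"),
    "missing_feature": ("surely_not_installed_pkg_xyz", "demo-extra"),
}


def make_lazy_package(name: str, lazy_map: dict) -> types.ModuleType:
    """Create a module whose listed attributes are imported on first access."""
    pkg = types.ModuleType(name)

    def __getattr__(attr):
        try:
            target, extra = lazy_map[attr]
        except KeyError:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}") from None
        try:
            module = importlib.import_module(target)
        except ModuleNotFoundError as exc:
            # Fail only when the feature is actually used, with an install hint.
            raise ImportError(
                f'{attr} needs an optional dependency; try: pip install "{name}[{extra}]"'
            ) from exc
        setattr(pkg, attr, module)  # cache so later lookups skip __getattr__
        return module

    # A __getattr__ stored in the module namespace is consulted on failed lookups.
    pkg.__getattr__ = __getattr__
    return pkg


pkg = make_lazy_package("demo_pkg", LAZY)   # creating the package never fails
print(pkg.json_tools.dumps({"ok": True}))   # imported only at this point
```

Accessing `pkg.missing_feature` raises an `ImportError` that names the extra to install, while everything else in the package stays usable.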
