Quickly build your crawler

Project description

Introduction

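Bricks aims to make crawler development as simple and enjoyable as building with blocks. The framework's core idea is to offer an intuitive, efficient way to build complex web crawlers while keeping the code concise and maintainable. Whether you are a beginner or an experienced developer, Bricks lets you assemble capable crawlers for anything from simple data scraping to complex web crawling.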

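With carefully designed interfaces and a modular structure, Bricks makes crawlers easy to compose, extend, and maintain. You can assemble a crawler structure that fits your needs the way you would snap blocks together, without digging into low-level details, while still keeping customization and control. Bricks is designed for development efficiency and flexibility, so that building a crawler is no longer a time-consuming chore.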

Features

Bricks has the following features:

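  • Event-driven extension: once the main logic of your crawler is defined, you can extend it through event hooks (before and after a request, before and after storage, and others) without modifying the core code, which keeps the crawler flow clear
  • Multiple crawler base classes: a code-only air crawler, a configuration-driven form crawler with a customizable flow, and a configuration-driven template crawler with a fixed flow
  • A rich set of parsers, including json / xpath / jsonpath / regex / custom, with support for writing parsing rules as configuration
  • Flexible, extensible downloaders: the current built-in downloader is curl-cffi, with requests / requests-go / pycurl / Playwright available as options; developers can also write their own downloader by following the specification
  • A flexible scheduler that handles both synchronous and asynchronous tasks and automatically adjusts the number of Workers based on the current number of tasks
  • Two built-in task queues, Local and Redis, for single-machine and distributed crawlers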
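
The event-driven extension described in the first feature can be illustrated with a minimal, framework-independent sketch. Everything named here (`EventHub`, `run_request`, the event names `before_request` and `after_request`, and the `fake_fetch` stub) is hypothetical and chosen only for illustration; it is not the Bricks API, which is documented in the project wiki.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Handler = Callable[[Context], None]


class EventHub:
    """Hypothetical event registry (illustration only, not the Bricks API)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def on(self, event: str) -> Callable[[Handler], Handler]:
        """Decorator that registers a handler for the named event."""
        def register(handler: Handler) -> Handler:
            self._handlers[event].append(handler)
            return handler
        return register

    def fire(self, event: str, context: Context) -> None:
        """Call every handler registered for the event, in registration order."""
        for handler in self._handlers.get(event, []):
            handler(context)


def run_request(hub: EventHub, url: str, fetch: Callable[[Context], str]) -> Context:
    """Core flow: it never changes, extensions attach through events."""
    context: Context = {"url": url, "headers": {}}
    hub.fire("before_request", context)
    context["response"] = fetch(context)
    hub.fire("after_request", context)
    return context


hub = EventHub()


@hub.on("before_request")
def add_user_agent(context: Context) -> None:
    # Extension point: adjust the request before it is sent.
    context["headers"]["User-Agent"] = "example-crawler/0.1"


@hub.on("after_request")
def record_length(context: Context) -> None:
    # Extension point: inspect the response after it arrives.
    context["length"] = len(context["response"])


def fake_fetch(context: Context) -> str:
    # Stand-in for a real downloader so the sketch runs offline.
    return "<html>stub</html>"


result = run_request(hub, "https://example.com", fake_fetch)
```

The point of the design is that `run_request` stays untouched while behavior is added or removed by registering handlers, which is the separation the feature list describes.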

Installation

Install the latest code:

pip install git+https://github.com/KKKKKKKEM/bricks.git

Install the stable release:

pip install bricks-py
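
After installing, one way to confirm which release is present is to query the installed distribution metadata. This is a small sketch that uses only the standard library (`importlib.metadata`, available since Python 3.8); the helper name `installed_version` is an assumption made for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if bricks-py is not installed.
print(installed_version("bricks-py"))
```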

Documentation

For detailed documentation, see the Wiki.

Download files

Source Distribution

bricks-py-0.0.15.tar.gz (100.1 kB)

Uploaded Source

Built Distribution

bricks_py-0.0.15-py3-none-any.whl (128.0 kB)

Uploaded Python 3

File details

Details for the file bricks-py-0.0.15.tar.gz.

File metadata

  • Download URL: bricks-py-0.0.15.tar.gz
  • Size: 100.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.18

File hashes

Hashes for bricks-py-0.0.15.tar.gz
Algorithm Hash digest
SHA256 4d7825245b891f2b1a3327b2d17f35d2a58a770e5c0d30c4517775135b7ae2a9
MD5 b671c3f5a50cadd9cc274eaf89901889
BLAKE2b-256 905593bb8a91d343dc8823aca0694e4f80d21ca50e1f1275e8c2fa3954deb90d
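
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using the standard library `hashlib`; the helper names `sha256_of` and `matches_expected` are assumptions made for this example, and the file is assumed to have been downloaded locally under its published name.

```python
import hashlib

# SHA256 digest published above for bricks-py-0.0.15.tar.gz.
EXPECTED_SHA256 = "4d7825245b891f2b1a3327b2d17f35d2a58a770e5c0d30c4517775135b7ae2a9"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the expected hex string, ignoring case."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected("bricks-py-0.0.15.tar.gz")` should return `True` only when the file in the current directory is byte-for-byte identical to the published archive.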

File details

Details for the file bricks_py-0.0.15-py3-none-any.whl.

File metadata

  • Download URL: bricks_py-0.0.15-py3-none-any.whl
  • Size: 128.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.18

File hashes

Hashes for bricks_py-0.0.15-py3-none-any.whl
Algorithm Hash digest
SHA256 d09833d7967ee2eafd8a185e05facb2a9d3ba3f296a56e5689f21ea3ec2a8965
MD5 20b9fcdd430253416c88f7bea14e2fdd
BLAKE2b-256 780112abf1f7712bc6f93b7b9fba1f352eb8701d9ac4bd815860bd037fe8d623
