An asynchronous crawler micro-framework written in Python.

hoopa

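Introduction

hoopa is a lightweight, fast, asynchronous distributed crawler framework.

  • Priority queues backed by memory or Redis
  • Works with HTTP libraries such as aiohttp, httpx, and requests
  • Can resume an interrupted crawl from where it stopped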
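
To make the priority-queue idea concrete, here is a minimal sketch of an in-memory priority queue for crawl requests, built only on the standard library's `asyncio.PriorityQueue`. It is not hoopa's implementation: the class `PriorityRequestQueue`, its `push`/`pop` methods, and the convention that a lower number means higher priority are all assumptions made for illustration.

```python
import asyncio
import itertools


class PriorityRequestQueue:
    """Hypothetical in-memory queue; NOT hoopa's actual queue class."""

    def __init__(self):
        self._queue = asyncio.PriorityQueue()
        # Monotonic counter breaks ties so equal priorities stay first-in, first-out.
        self._counter = itertools.count()

    async def push(self, url, priority=0):
        # Assumption: a lower number is served first.
        await self._queue.put((priority, next(self._counter), url))

    async def pop(self):
        _, _, url = await self._queue.get()
        return url

    def empty(self):
        return self._queue.empty()


async def drain_in_priority_order(items):
    """Push (url, priority) pairs, then pop them all in priority order."""
    queue = PriorityRequestQueue()
    for url, priority in items:
        await queue.push(url, priority)
    ordered = []
    while not queue.empty():
        ordered.append(await queue.pop())
    return ordered


if __name__ == "__main__":
    pairs = [
        ("http://httpbin.org/get?page=2", 5),
        ("http://httpbin.org/get?page=1", 1),
    ]
    print(asyncio.run(drain_in_priority_order(pairs)))
```

A Redis-backed variant would typically keep the same interface and store the entries in a sorted set, so that several worker processes can share one queue.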

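hoopa works with both synchronous and asynchronous code. If you are not used to async, you can write synchronous code instead, but never perform blocking operations inside an asynchronous method.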
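
The rule about blocking calls applies to any asyncio program, not only to hoopa. The sketch below uses only the standard library and no hoopa API: `blocking_parse` is a hypothetical stand-in for slow synchronous work, and `loop.run_in_executor` moves it to a worker thread so the event loop keeps serving other coroutines in the meantime.

```python
import asyncio
import time


def blocking_parse(text):
    """Hypothetical stand-in for slow synchronous work (e.g. heavy parsing)."""
    time.sleep(0.2)  # blocks the calling thread
    return text.upper()


async def heartbeat(ticks):
    """Records a tick every 50 ms; it stalls if the event loop is blocked."""
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())


async def parse_without_blocking(text):
    # Run the blocking function in the default thread pool instead of
    # calling it directly inside the coroutine.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_parse, text)


async def main():
    ticks = []
    result, _ = await asyncio.gather(
        parse_without_blocking("ok"),
        heartbeat(ticks),
    )
    return result, len(ticks)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling `blocking_parse` directly inside a coroutine would instead freeze every other task for the full 0.2 seconds, which is the situation the warning above is about.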

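This framework is built for the author's own use. Stability is not guaranteed; do not use it in production.

Documentation: https://fishtn.github.io/hoopa/

Requirements: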

  • Python 3.7.0+
  • Works on Linux, Windows, macOS

Installation

# For Linux & Mac
pip install -U hoopa[uvloop]

# For Windows
pip install -U hoopa

Getting started

Create a spider

hoopa create -s first_spider

Then add the URL: http://httpbin.org/get

import hoopa


class FirstSpider(hoopa.Spider):
    name = "first"
    start_urls = ["http://httpbin.org/get"]

    def parse(self, request, response):
        print(response)


if __name__ == "__main__":
    FirstSpider.start()

TODO

  • Monitoring platform
  • Remote deployment
  • Task scheduling

Acknowledgements
