An asynchronous crawler micro-framework based on Python.

Project description

hoopa

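Introduction

hoopa is a lightweight, fast, asynchronous distributed crawler framework.

  • Supports priority queues backed by memory or redis
  • Supports HTTP libraries such as aiohttp, httpx, and requests
  • Supports resuming an interrupted crawl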
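
To make the queue and resume features above concrete, here is a minimal sketch of an in-memory priority queue that can be dumped and reloaded. It is an illustration only and not hoopa's implementation: the class name `MemoryPriorityQueue`, its methods, the JSON layout, and the "lower number is served first" convention are all assumptions.

```python
import heapq
import itertools
import json


class MemoryPriorityQueue:
    """Hypothetical in-memory request queue (not hoopa's actual class)."""

    def __init__(self):
        self._heap = []                    # entries are (priority, seq, url)
        self._counter = itertools.count()  # tie-breaker: FIFO within a priority
        self._seen = set()                 # URLs already queued, for deduplication

    def push(self, url, priority=0):
        """Queue a URL; return False if it was already queued before."""
        if url in self._seen:
            return False
        self._seen.add(url)
        heapq.heappush(self._heap, (priority, next(self._counter), url))
        return True

    def pop(self):
        """Return the URL with the lowest priority number (assumed convention)."""
        _priority, _seq, url = heapq.heappop(self._heap)
        return url

    def __len__(self):
        return len(self._heap)

    def dump(self):
        """Serialize pending and seen URLs so a later run can resume."""
        pending = [(priority, url) for priority, _seq, url in sorted(self._heap)]
        return json.dumps({"pending": pending, "seen": sorted(self._seen)})

    @classmethod
    def load(cls, payload):
        """Rebuild a queue from the string produced by dump()."""
        data = json.loads(payload)
        queue = cls()
        for priority, url in data["pending"]:
            queue.push(url, priority)
        # Also remember URLs that were already popped before the dump.
        queue._seen.update(data["seen"])
        return queue


if __name__ == "__main__":
    queue = MemoryPriorityQueue()
    queue.push("http://httpbin.org/get?page=2", priority=5)
    queue.push("http://httpbin.org/get?page=1", priority=1)
    print(queue.pop())            # the priority-1 URL comes out first
    saved = queue.dump()          # persist this string before stopping
    resumed = MemoryPriorityQueue.load(saved)
    print(len(resumed), resumed.pop())
```

A redis-backed variant would presumably keep the same interface and store the entries in a sorted set so that several workers can share one queue; that is a guess at the design, not a description of hoopa's code.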

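Both synchronous and asynchronous code are supported. If you are not used to async, you can write synchronous code instead, but note that you must not perform blocking operations inside an asynchronous method.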
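
The warning about blocking inside asynchronous methods can be illustrated with plain asyncio, independent of hoopa. In this sketch, `blocking_parse` and `handle` are hypothetical names, and `time.sleep` stands in for any blocking call such as a synchronous database driver. The blocking work is handed to a thread pool so the event loop stays free for other requests.

```python
import asyncio
import time


def blocking_parse(html):
    # Stand-in for blocking work; calling this directly inside a coroutine
    # would stall every other task on the event loop.
    time.sleep(0.2)
    return len(html)


async def handle(html):
    loop = asyncio.get_running_loop()
    # Run the blocking function in the default thread pool and await its result.
    return await loop.run_in_executor(None, blocking_parse, html)


async def main():
    # The three calls overlap instead of running one after another.
    return await asyncio.gather(*(handle("x" * n) for n in (1, 2, 3)))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

`run_in_executor` is used rather than `asyncio.to_thread` because the latter requires Python 3.9, while the stated minimum here is Python 3.7.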

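The project is still under development and testing, so do not use it in production. If you find a problem, please open an issue.

Documentation: https://fishtn.github.io/hoopa/

Requirements: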

  • Python 3.7.0+
  • Works on Linux, Windows, macOS

Installation

# For Linux & Mac
pip install -U hoopa[uvloop]

# For Windows
pip install -U hoopa
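
The `uvloop` extra is offered only for Linux and macOS, presumably because uvloop does not support Windows. Whether hoopa switches event loops on its own is not stated here, so the following is only a sketch of the usual pattern for using uvloop when it is installed and falling back to the default asyncio loop otherwise. `install_fast_loop` is a hypothetical helper name.

```python
import asyncio


def install_fast_loop():
    """Use uvloop if it can be imported; otherwise keep asyncio's default loop."""
    try:
        import uvloop
    except ImportError:
        return False
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    return True


async def main():
    await asyncio.sleep(0)
    # Report which module provides the running event loop.
    return type(asyncio.get_running_loop()).__module__


if __name__ == "__main__":
    using_uvloop = install_fast_loop()
    print(using_uvloop, asyncio.run(main()))
```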

Getting started

Create a spider

hoopa create -s first_spider

Then add the URL http://httpbin.org/get:

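import hoopa


class FirstSpider(hoopa.Spider):
    # Spider name and the URLs the crawl starts from
    name = "first"
    start_urls = ["http://httpbin.org/get"]

    def parse(self, request, response):
        # Receives each request together with its response
        print(response)


if __name__ == "__main__":
    # Run the spider when this file is executed directly
    FirstSpider.start()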

TODO

  • Monitoring platform
  • Remote deployment
  • Task scheduling

Thanks

Download files

Source Distribution

hoopa-0.1.10.tar.gz (39.6 kB)

Uploaded Source

Built Distribution

hoopa-0.1.10-py3-none-any.whl (52.4 kB)

Uploaded Python 3

File details

Details for the file hoopa-0.1.10.tar.gz.

File metadata

  • Download URL: hoopa-0.1.10.tar.gz
  • Upload date:
  • Size: 39.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.9.12

File hashes

Hashes for hoopa-0.1.10.tar.gz
Algorithm     Hash digest
SHA256        125ec7d570d2c346f63ebe2d1b97f73d3d1ae8733dbd3da0ea010489fb39bda7
MD5           07dac29fee83e16dd98300254f548457
BLAKE2b-256   dacccb00959b1ab3a1a11969367f4098f3169ba5ec6c63ac27fe643ed716dfa6

File details

Details for the file hoopa-0.1.10-py3-none-any.whl.

File metadata

  • Download URL: hoopa-0.1.10-py3-none-any.whl
  • Upload date:
  • Size: 52.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.9.12

File hashes

Hashes for hoopa-0.1.10-py3-none-any.whl
Algorithm     Hash digest
SHA256        600e99ddf508f12f7de37994c10b28324fe7770037492803c2f7558cbdb15f5e
MD5           e987d8c1becb28d0e229f77455263ff6
BLAKE2b-256   4f9590473cf280455e28ece75fb85e1c00465d2d147a2c3da92ff67fd26dda07
