CrawLsy-Spider

Introduction

CrawLsy-Spider is a crawler task management system built on Redis and RQ. It aims to simplify submitting and managing crawler tasks.

Installation

  1. Make sure Python 3.9 or later is installed.
  2. Install the package:
pip install crawlsy-spider
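
The two installation steps can be checked from Python itself. The following is a minimal sketch using only the standard library; the helper names `python_ok` and `installed_version` are hypothetical and are not part of CrawLsy-Spider.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 9)  # minimum version stated in the installation steps


def python_ok(version_info=sys.version_info):
    """Return True when the interpreter meets the stated minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(dist_name="crawlsy-spider"):
    """Return the installed distribution version, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("crawlsy-spider version:", installed_version())
```

If `installed_version()` returns `None`, the `pip install` step most likely ran against a different interpreter or virtual environment.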

Usage

Initialize a project

crawlsy-spider new myproject

Write the task logic in task.py

import requests

def task_func(url):
    # Set a timeout so a stalled server cannot block the worker indefinitely.
    return requests.get(url, timeout=10).text

Submit tasks in produce.py

from crawlsy_spider.craw import CrawLsy

from task import task_func  # import the task function defined in task.py

with CrawLsy("tests", is_async=True) as craw:
    result = craw.submit(task_func, 'https://baidu.com')
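
A crawl usually covers many URLs rather than one. As a rough sketch, the helper below submits one job per URL, relying only on the `submit(func, arg)` call shown above. The name `submit_all` is hypothetical, and because this README does not document what `submit` returns, the return values are collected without being interpreted.

```python
def submit_all(craw, task, urls):
    """Submit one job per URL and collect whatever submit() returns."""
    return [craw.submit(task, url) for url in urls]


# Usage, mirroring the snippet above:
#
# from crawlsy_spider.craw import CrawLsy
# from task import task_func
#
# with CrawLsy("tests", is_async=True) as craw:
#     handles = submit_all(craw, task_func, ["https://baidu.com"])
```

Passing the `craw` object in as a parameter lets the helper be exercised with a stand-in object during testing.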

Deploy worker nodes

python consumers.py

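Run the producer node

The framework separates producers from consumers, so starting workers across multiple servers (a cluster) does not run any tasks by itself. You also need to start one more node that runs the producer service: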

python producer.py
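
The producer/consumer split can be illustrated with an in-process analogy. This sketch uses the standard library's `queue` and `threading` modules in place of Redis and RQ; it is not how CrawLsy-Spider is implemented, and `worker` and `run_demo` are hypothetical names. It only shows why workers sit idle until a producer enqueues work.

```python
import queue
import threading


def worker(tasks, results):
    """Consumer: run (func, arg) pairs until a None sentinel arrives."""
    while True:
        item = tasks.get()  # blocks while the queue is empty
        if item is None:
            break
        func, arg = item
        results.append(func(arg))


def run_demo(items, n_workers=2):
    tasks = queue.Queue()
    results = []
    workers = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()  # workers are running, but there is nothing to do yet
    for item in items:
        tasks.put((len, item))  # producer: enqueue one task per item
    for _ in workers:
        tasks.put(None)  # one stop sentinel per worker
    for w in workers:
        w.join()
    return sorted(results)


if __name__ == "__main__":
    print(run_demo(["a", "bbb", "cc"]))
```

In the real deployment the queue lives in Redis, so the workers and the producer can run on different machines.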

License

GNU General Public License v2.0

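Crawlsy is free software; you can redistribute it or modify it under the terms of the GNU General Public License (version 2) as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License; if not, see http://www.gnu.org/licenses/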
