CrawLsy-Spider

Introduction

CrawLsy-Spider is a crawler task management system built on Redis and RQ. It aims to simplify submitting and managing crawl tasks.

Installation

  1. Make sure Python 3.9 or later is installed.
  2. Install the package and its dependencies:
pip install crawlsy-spider

Usage

Initialize a project

crawlsy-spider new myproject

Write the task logic in task.py

import requests

def task_func(url):
    # Fetch the page and return its body as text.
    return requests.get(url).text
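The task above calls `requests.get` without a timeout, so an unresponsive server can block a worker indefinitely. The following is a minimal sketch of a task with an explicit timeout. It swaps in the standard library's `urllib.request` so that it runs without extra packages; with `requests`, passing `timeout=` to `requests.get` has a similar effect. The name `fetch_text` and the 10-second default are illustrative assumptions, not part of CrawLsy-Spider.

```python
from urllib.request import urlopen


def fetch_text(url, timeout=10.0):
    """Fetch url and return the response body decoded as text.

    Hypothetical variant of task_func: network errors and timeouts
    propagate as exceptions so a failed job is visible to the caller.
    """
    with urlopen(url, timeout=timeout) as response:
        # Use the charset declared by the server, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Like `task_func`, this function takes one URL and returns a string, so it could be submitted in the same way.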

Submit tasks in produce.py

from crawlsy_spider.craw import CrawLsy

from task import task_func  # import the task function

with CrawLsy("tests", is_async=True) as craw:
    result = craw.submit(task_func, 'https://baidu.com')
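A crawl usually involves many URLs rather than one. The sketch below submits one task per distinct URL and relies only on the `craw.submit(func, url)` call shown above. The helper name `submit_all` is hypothetical, and what `submit` returns is not documented here, so the helper simply collects whatever it gets back.

```python
from typing import Any, Callable, Iterable, List


def submit_all(craw: Any, func: Callable[[str], Any], urls: Iterable[str]) -> List[Any]:
    """Submit one task per distinct URL, preserving first-seen order.

    craw is assumed to be any object exposing submit(func, url),
    such as the object yielded by the CrawLsy context manager.
    """
    seen = set()
    handles = []
    for url in urls:
        if url in seen:
            continue  # skip URLs that were already submitted
        seen.add(url)
        handles.append(craw.submit(func, url))
    return handles
```

Inside the `with CrawLsy("tests", is_async=True) as craw:` block, this would be called as `submit_all(craw, task_func, urls)`.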

Deploy worker nodes

python worker.py

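Run the producer node

Because the framework separates producers from consumers, starting workers on several servers (a cluster) is not enough: the workers have no tasks to run until a producer submits them. Start one more node to run the producer: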

python producer.py
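The producer and consumer split can be illustrated with a small in-process sketch. It uses Python's `queue.Queue` and threads as a stand-in for the Redis-backed queue and the worker processes. This is not how CrawLsy-Spider is implemented, and the names `worker` and `run_demo` are hypothetical. It only shows that consumers sit idle until a producer puts jobs on the queue.

```python
import queue
import threading


def worker(tasks, results):
    """Consumer: run (func, arg) jobs until a None sentinel arrives."""
    while True:
        job = tasks.get()  # blocks while the queue is empty
        if job is None:
            break
        func, arg = job
        results.put((arg, func(arg)))


def run_demo(func, args, n_workers=2):
    """Start consumers first, then produce jobs and collect the results."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()  # workers are now waiting and have nothing to do
    for arg in args:
        tasks.put((func, arg))  # the producer side: submit the jobs
    for _ in threads:
        tasks.put(None)  # one stop sentinel per worker
    for t in threads:
        t.join()
    collected = {}
    while not results.empty():
        arg, value = results.get()
        collected[arg] = value
    return collected
```

In the real deployment the queue lives in Redis, so the producer and the workers can run on different machines.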

License

GNU General Public License v2.0

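Crawlsy is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, version 2, as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License; if not, see http://www.gnu.org/licenses/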
