CrawLsy-Spider

Introduction

CrawLsy-Spider is a crawler task management system built on Redis and RQ. It aims to simplify submitting and managing crawler tasks.

Installation

  1. Make sure Python 3.12 or later is installed.
  2. Install the package and its dependencies:
pip install crawlsy-spider

Usage

Initialize a project

crawlsy-spider new myproject

Write the task logic in task.py

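import requests

def task_func(url):
    """Fetch url and return the response body as text."""
    # requests has no default timeout, so set one to keep a worker from hanging.
    return requests.get(url, timeout=10).text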
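
The task above returns only the page text. As a rough sketch, a task might also validate the URL and return the HTTP status alongside the body. The version below is an illustration, not part of CrawLsy-Spider: `fetch_page`, its `opener` parameter, and the shape of the returned dict are hypothetical, and it swaps `requests` for the standard library's `urllib` so that it runs without extra dependencies.

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def fetch_page(url, timeout=10.0, opener=urlopen):
    """Fetch url and return a small dict describing the response.

    Hypothetical helper for illustration; `opener` exists only so the
    network call can be replaced in tests.
    """
    scheme = urlparse(url).scheme
    if scheme not in ("http", "https"):
        raise ValueError(f"unsupported URL scheme: {url!r}")
    # With the default opener, 4xx/5xx responses raise urllib.error.HTTPError.
    with opener(url, timeout=timeout) as resp:
        body = resp.read()
        return {
            "url": url,
            "status": resp.status,
            # Assumes UTF-8; undecodable bytes are replaced rather than raising.
            "text": body.decode("utf-8", errors="replace"),
        }
```

Returning a plain dict keeps the result easy to serialize, which matters if the job result is stored by the queue backend.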

Submit tasks in produce.py

from crawlsy_spider.craw import CrawLsy

from task import task_func  # import the task function defined in task.py

with CrawLsy("tests", is_async=True) as craw:
    result = craw.submit(task_func, 'https://baidu.com')
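
A producer usually submits more than one URL. The helper below is a minimal sketch of that loop, assuming only the call shape shown above, `craw.submit(func, url)`. The name `submit_all` is hypothetical and not part of CrawLsy-Spider, and what `submit` returns is not documented here, so the helper simply collects whatever comes back.

```python
from typing import Any, Callable, Iterable


def submit_all(
    submit: Callable[..., Any],
    func: Callable[[str], Any],
    urls: Iterable[str],
) -> list:
    """Call submit(func, url) once per distinct URL, preserving input order."""
    seen = set()
    handles = []
    for url in urls:
        url = url.strip()
        if not url or url in seen:
            continue  # skip blank entries and duplicates
        seen.add(url)
        handles.append(submit(func, url))
    return handles


# Intended use inside the producer (not run here; it needs Redis and a worker):
# with CrawLsy("tests", is_async=True) as craw:
#     handles = submit_all(craw.submit, task_func, ["https://baidu.com"])
```

Passing `submit` in as a parameter keeps the loop independent of the queue, so it can be exercised with a stand-in function before any worker is running.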

Deploy worker nodes

python worker.py

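Run the producer node

The framework separates producers from consumers, so starting workers on several servers (a cluster) is not enough to make the service run: the workers only wait for jobs. You also need to start one more node that runs the producer: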

python producer.py
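
The split between workers and a producer can be pictured with an in-process analogy. The sketch below is not how CrawLsy-Spider is implemented: it uses the standard library's `queue.Queue` and threads in place of Redis and separate worker machines, and the names `worker` and `run_demo` are hypothetical. It only shows why idle workers do nothing until a producer puts jobs on the shared queue.

```python
import queue
import threading


def worker(tasks, results):
    """Consume (func, arg) jobs until a None sentinel arrives."""
    while True:
        job = tasks.get()  # blocks while the queue is empty
        if job is None:
            break
        func, arg = job
        results.put((arg, func(arg)))


def run_demo(func, args, n_workers=2):
    """Start workers first, then act as the producer and collect results."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()  # workers are now idle, waiting for jobs

    for arg in args:
        tasks.put((func, arg))  # the producer's role: enqueue work
    for _ in threads:
        tasks.put(None)  # one stop signal per worker
    for t in threads:
        t.join()

    collected = {}
    while not results.empty():
        arg, value = results.get()
        collected[arg] = value
    return collected
```

In the real deployment the queue lives in Redis, so the workers and the producer can run on different servers.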
