Business-layer wrappers for a crawler system

Project description

Common class wrappers for the crawler system

Goals

  • Redis db allocation
  • URL record pool wrapper
  • Downloaded page pool wrapper
  • Parse result pool wrapper
  • Website information class wrapper

Redis db allocation (0-15)

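- 0 -> commonly used queues, sorted sets, and sets
Hash tables defined in this db:
    - website -> website information
    - url_record -> URL record pool
    - url_page -> downloaded page pool
    - parse_result -> parse result pool
- 10-15 -> reserved for monitoring, which manages these dbs itself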
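
The allocation above can be written down as a few constants. This is only a minimal sketch: `DATA_DB`, `MONITOR_DBS`, `HASH_KEYS`, and `db_role` are hypothetical names chosen for illustration and are not taken from qg_spider_sdk. The source does not say how dbs 1-9 are used, so the sketch reports them as unassigned.

```python
# Hypothetical layout constants; these names are illustrative and are not
# part of qg_spider_sdk's documented API.
DATA_DB = 0                  # queues, sorted sets, sets, and the hashes below
MONITOR_DBS = range(10, 16)  # dbs 10-15, managed by monitoring itself

# Hash names listed in the allocation above, with their stated purpose.
HASH_KEYS = {
    "website": "website information",
    "url_record": "URL record pool",
    "url_page": "downloaded page pool",
    "parse_result": "parse result pool",
}


def db_role(db: int) -> str:
    """Return the role of a Redis db index under the layout above."""
    if not 0 <= db <= 15:
        raise ValueError(f"Redis db index out of range 0-15: {db}")
    if db == DATA_DB:
        return "data"
    if db in MONITOR_DBS:
        return "monitoring"
    return "unassigned"  # dbs 1-9 are not described in the source
```

With the redis-py client, a connection to the data db could then be opened as `redis.Redis(db=DATA_DB)` and a hash such as `url_record` written with `hset`. The field and value layout inside each hash is not specified here, so it would have to be taken from the SDK itself.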

Download files

Source Distribution

qg_spider_sdk-5.2.9.tar.gz (21.4 kB)

Uploaded Source

Built Distribution

qg_spider_sdk-5.2.9-py3-none-any.whl (30.3 kB)

Uploaded Python 3

File details

Details for the file qg_spider_sdk-5.2.9.tar.gz.

File metadata

  • Download URL: qg_spider_sdk-5.2.9.tar.gz
  • Size: 21.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.4.2 requests/2.21.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.7.1

File hashes

Hashes for qg_spider_sdk-5.2.9.tar.gz
Algorithm Hash digest
SHA256 b1ebadce5edc6f791e664a3588b31129512b098b31341c9c964603133c7fbe80
MD5 a8a946743af72984ab9bd40aedac8b64
BLAKE2b-256 4e08296fee21a417382ad565b39528659cfd52cc8a8f0e8552a39c9c37cf3f37

File details

Details for the file qg_spider_sdk-5.2.9-py3-none-any.whl.

File metadata

  • Download URL: qg_spider_sdk-5.2.9-py3-none-any.whl
  • Size: 30.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.4.2 requests/2.21.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.7.1

File hashes

Hashes for qg_spider_sdk-5.2.9-py3-none-any.whl
Algorithm Hash digest
SHA256 53a2d834e69af97d51da90c0ab7b40d1def97d6a803630a9c38117df4fdbd8d8
MD5 c79c20e6539b525c55f4d8b13c8b4568
BLAKE2b-256 18d9e474a91092cd9c151db0f4a13f21bc66d7a37b28c92eca2e9c1e11066c4b
