A heavily modified utility toolkit

Derived from

https://github.com/shengchenyang/AyugeSpiderTools/blob/master/docs//docs/intro/install.md

with templates added for personal use.

Installation

With Python 3.8+, you can install directly with the following command:

pip install gzspidertools

Optional install 1: install all database-related dependencies:

pip install gzspidertools[database]

Optional install 2: install all dependencies with the following command:

pip install gzspidertools[all]

Note: for detailed installation instructions, see the installation guide.
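
After installing, one way to confirm which version ended up in the current environment is to query the package metadata from Python. The sketch below uses only the standard library. The helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name `gzspidertools`, as in the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "gzspidertools") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found else "gzspidertools is not installed")
```

Returning `None` instead of raising keeps the check usable in setup scripts that need to decide whether to run pip.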

Usage

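# Show the library version
gzcmd version

# Create a project
gzcmd startproject <project_name>

# Enter the project root directory
cd <project_name>

# Replace (overwrite) the .conf file with your real configuration.
# This is only for demonstration convenience; normally you fill in the settings you need directly in the .conf file under VIT
cp /root/mytemp/.conf DemoSpider/VIT/.conf

# Generate a spider script
gzcmd genspider <spider_name> <example.com>

# Generate a scrapy-redis spider script (install scrapy-redis first: pip install scrapy_redis-0.7.3-py2.py3-none-any.whl)
gzcmd genspider -t=sr <spider_name> <example.com>

# Run the spider
scrapy crawl <spider_name>
# Note: you can also use gzcmd crawl <spider_name>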

RedisDB

RedisDB supports Sentinel mode, cluster mode, and ordinary single-node mode, and wraps the commonly used Redis operations.

Connection

If the database connection is already configured in environment variables or in the settings file, no arguments need to be passed.
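
The fallback described here can be pictured as a lookup chain: an explicit argument wins, then a configured value, then a default. The sketch below is not the library's code. `resolve_redis_config` is a hypothetical helper, the assumption that the environment variables share the `REDISDB_*` names used in the settings file is mine, and the `localhost:6379` and `db=0` defaults are illustrative.

```python
import os
from typing import Mapping, Optional


def resolve_redis_config(
    ip_ports: Optional[str] = None,
    db: Optional[int] = None,
    user_pass: Optional[str] = None,
    service_name: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Merge explicit arguments with REDISDB_* values (names assumed)."""
    env = os.environ if env is None else env

    def pick(explicit, key, default=None):
        # An explicit argument always takes priority over configuration.
        if explicit is not None:
            return explicit
        value = env.get(key)
        # Treat a missing or empty value as "not configured".
        return value if value not in (None, "") else default

    return {
        "ip_ports": pick(ip_ports, "REDISDB_IP_PORTS", "localhost:6379"),
        "db": int(pick(db, "REDISDB_DB", 0)),
        "user_pass": pick(user_pass, "REDISDB_USER_PASS"),
        "service_name": pick(service_name, "REDISDB_SERVICE_NAME"),
    }
```

Passing `env` explicitly makes the helper easy to exercise without touching the real process environment.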

Standard (single-node) mode

db = RedisDB(ip_ports="localhost:6379", db=0, user_pass=None)

Connecting with a URL

db = RedisDB.from_url("redis://[[username]:[password]]@[host]:[port]/[db]")
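
To make the pieces of that URL concrete, the sketch below splits a `redis://` URL into its components with the standard library. It illustrates the URL layout only and is not how `RedisDB.from_url` is implemented. `parse_redis_url` is a hypothetical helper, and the fallbacks (port 6379, database 0) follow common Redis conventions rather than anything stated by this library.

```python
from urllib.parse import unquote, urlparse


def parse_redis_url(url: str) -> dict:
    """Split a redis:// URL into host, port, db, username, and password."""
    parts = urlparse(url)
    if parts.scheme not in ("redis", "rediss"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # The path carries the database index, e.g. "/2" selects db 2.
    db_text = parts.path.lstrip("/")
    return {
        "host": parts.hostname or "localhost",
        "port": parts.port or 6379,
        "db": int(db_text) if db_text else 0,
        # Credentials may be percent-encoded inside a URL.
        "username": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
    }
```

A password-only URL such as `redis://:secret@localhost:6379/0` yields `username=None`, because the part before the colon is empty.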

Sentinel mode

db = RedisDB(ip_ports="172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379", db=0, user_pass=None, service_name="my_master")

Note: separate multiple addresses with commas; service_name is required.

The corresponding configuration in the settings file is:

REDISDB_IP_PORTS = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
REDISDB_USER_PASS = ""
REDISDB_DB = 0
REDISDB_SERVICE_NAME = "my_master"
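
A comma-separated address string like the one above has to be split into individual nodes before a client can use it. The sketch below shows one way to do that. `parse_ip_ports` is a hypothetical helper, not part of this library. The resulting list of `(host, port)` tuples is the shape redis-py's `Sentinel` class accepts for its sentinel list, though whether RedisDB builds on redis-py in this way is an assumption.

```python
from typing import List, Tuple


def parse_ip_ports(ip_ports: str) -> List[Tuple[str, int]]:
    """Turn "host1:port1,host2:port2" into [(host1, port1), (host2, port2)]."""
    nodes = []
    for item in ip_ports.split(","):
        item = item.strip()
        if not item:
            # Tolerate stray commas and surrounding whitespace.
            continue
        host, sep, port = item.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected host:port, got {item!r}")
        nodes.append((host, int(port)))
    return nodes
```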

Cluster mode

db = RedisDB(ip_ports="172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379", db=0, user_pass=None)

Note: separate multiple addresses with commas; do not pass service_name.

The corresponding configuration in the settings file is:

REDISDB_IP_PORTS = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
REDISDB_USER_PASS = ""
REDISDB_DB = 0
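
Taken together, the notes for the three modes suggest a simple rule for telling them apart: a `service_name` means Sentinel, several addresses without one mean cluster, and a single address means a standalone node. The sketch below encodes that reading. `guess_mode` is a hypothetical helper inferred from the notes in this section, and the library's actual selection logic may differ.

```python
from typing import Optional


def guess_mode(ip_ports: str, service_name: Optional[str] = None) -> str:
    """Infer the connection mode from the address list and service name."""
    addresses = [a.strip() for a in ip_ports.split(",") if a.strip()]
    if not addresses:
        raise ValueError("ip_ports must contain at least one host:port")
    if service_name:
        # Sentinel connections are identified by the monitored service name.
        return "sentinel"
    if len(addresses) > 1:
        # Several nodes and no service name: treat as a cluster.
        return "cluster"
    return "single"
```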
