A heavily customized utility library for personal use

Derived from

https://github.com/shengchenyang/AyugeSpiderTools/blob/master/docs//docs/intro/install.md

with templates for personal use added.

Installation

With Python 3.8+, install directly with the following command:

pip install gzspidertools

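Optional install 1: install all database-related dependencies (the quotes keep shells such as zsh from expanding the brackets):

pip install "gzspidertools[database]"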

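Optional install 2: install all dependencies with the following command:

pip install "gzspidertools[all]"

Note: see the installation guide for detailed installation instructions.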
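
To confirm from Python that the install succeeded, the following is a minimal sketch using the standard library's importlib.metadata. The helper name installed_version is hypothetical and is not part of gzspidertools; only the distribution name comes from the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "gzspidertools" is the distribution name used in the pip commands above.
    print(installed_version("gzspidertools"))
```

A None result means pip did not install the package into the interpreter that is running the check, which usually points to a different virtual environment.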

Usage

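# Show the library version
gzcmd version

# Create a project
gzcmd startproject <project_name>

# Enter the project root directory
cd <project_name>

# Replace (overwrite) the .conf file with your real configuration:
# This is done here only to keep the demo short; normally you fill in the settings you need directly in the .conf file under VIT
cp /root/mytemp/.conf DemoSpider/VIT/.conf

# Generate a spider script
gzcmd genspider <spider_name> <example.com>

# Generate a scrapy-redis spider script (install scrapy-redis first, e.g. pip install scrapy_redis-0.7.3-py2.py3-none-any.whl)
gzcmd genspider -t=sr <spider_name> <example.com>

# Run the spider
scrapy crawl <spider_name>
# Note: gzcmd crawl <spider_name> also works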

RedisDB

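RedisDB supports Sentinel mode, Cluster mode, and ordinary single-node mode, and wraps the commonly used Redis operations.

Connection

If the database connection is already configured in environment variables or in the settings file, the arguments can be omitted.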
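
The fallback described above can be pictured with a small sketch. Everything here is an assumption for illustration: the helper resolve_option is hypothetical, the environment variable names are assumed to match the setting names (such as REDISDB_IP_PORTS), and the precedence of explicit argument, then environment variable, then settings is a guess rather than the library's documented behavior.

```python
import os
from typing import Any, Mapping, Optional


def resolve_option(
    name: str,
    explicit: Optional[Any],
    settings: Mapping[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> Any:
    """Pick a connection option from the first source that provides it.

    Assumed order (not confirmed by the library): explicit argument,
    then environment variable, then the settings mapping.
    """
    env = os.environ if env is None else env
    if explicit is not None:
        return explicit
    if name in env:
        return env[name]
    return settings.get(name)


if __name__ == "__main__":
    settings = {"REDISDB_IP_PORTS": "localhost:6379", "REDISDB_DB": 0}
    # No explicit argument, so the value falls back to env or settings.
    print(resolve_option("REDISDB_IP_PORTS", None, settings))
```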

Single-node mode

db = RedisDB(ip_ports="localhost:6379", db=0, user_pass=None)

Connecting with a URL

db = RedisDB.from_url("redis://[[username]:[password]]@[host]:[port]/[db]")
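
To see which part of such a URL maps to which connection field, here is a minimal sketch that splits a concrete URL with the standard library. The helper describe_redis_url is hypothetical and says nothing about how RedisDB.from_url parses its argument; defaulting the database index to 0 when the path is empty is also an assumption.

```python
from typing import Any, Dict
from urllib.parse import urlsplit


def describe_redis_url(url: str) -> Dict[str, Any]:
    """Split a redis:// URL into its username, password, host, port and db parts."""
    parts = urlsplit(url)
    db_text = parts.path.lstrip("/")
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        # Assumption: an empty path means database 0.
        "db": int(db_text) if db_text else 0,
    }


if __name__ == "__main__":
    print(describe_redis_url("redis://user:secret@localhost:6379/0"))
```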

Sentinel mode

db = RedisDB(ip_ports="172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379", db=0, user_pass=None, service_name="my_master")

Note: separate multiple addresses with commas; service_name is required.

The equivalent configuration in the settings file is:

REDISDB_IP_PORTS = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
REDISDB_USER_PASS = ""
REDISDB_DB = 0
REDISDB_SERVICE_NAME = "my_master"

Cluster mode

db = RedisDB(ip_ports="172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379", db=0, user_pass=None)

Note: separate multiple addresses with commas; service_name is not needed.

The equivalent configuration in the settings file is:

REDISDB_IP_PORTS = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
REDISDB_USER_PASS = ""
REDISDB_DB = 0
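
The three modes differ only in the number of addresses and in whether service_name is given. The sketch below restates that rule in code. Both helpers, parse_ip_ports and guess_mode, are hypothetical names introduced for illustration, and whether RedisDB selects its mode in exactly this way is an assumption drawn from the notes above.

```python
from typing import List, Optional, Tuple


def parse_ip_ports(ip_ports: str) -> List[Tuple[str, int]]:
    """Turn "host1:port1,host2:port2" into a list of (host, port) pairs."""
    nodes = []
    for item in ip_ports.split(","):
        item = item.strip()
        if not item:
            continue
        # Every address must be written as host:port; int() raises otherwise.
        host, _, port = item.rpartition(":")
        nodes.append((host, int(port)))
    return nodes


def guess_mode(ip_ports: str, service_name: Optional[str] = None) -> str:
    """Return "sentinel", "cluster" or "single" following the notes above."""
    nodes = parse_ip_ports(ip_ports)
    if service_name:
        return "sentinel"  # a service_name was passed
    if len(nodes) > 1:
        return "cluster"  # several addresses, no service_name
    return "single"  # one address


if __name__ == "__main__":
    addresses = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
    print(guess_mode(addresses, service_name="my_master"))
    print(guess_mode(addresses))
    print(guess_mode("localhost:6379"))
```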
