A modified ("hacked") utility library for personal use
Based on
https://github.com/shengchenyang/AyugeSpiderTools/blob/master/docs//docs/intro/install.md
with templates for personal use added.
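Installation
Requires Python 3.8+.
Install directly with the following command:
pip install gzspidertools
Optional install 1: install all database-related dependencies:
pip install gzspidertools[database]
Optional install 2: install all dependencies:
pip install gzspidertools[all]
Note: for detailed installation instructions, see the installation guide.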
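After installing, it can be useful to confirm that the interpreter meets the 3.8+ requirement and that the package is visible to it. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and is not part of gzspidertools.

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; not part of gzspidertools.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The README states that Python 3.8+ is required.
    if sys.version_info < (3, 8):
        raise SystemExit("gzspidertools requires Python 3.8 or newer")
    print("gzspidertools version:", installed_version("gzspidertools"))
```

A result of `None` suggests the package was installed into a different environment than the one running the check.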
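Usage
# Show the library version
gzcmd version
# Create a project
gzcmd startproject <project_name>
# Enter the project root directory
cd <project_name>
# Replace (overwrite) the .conf file with your real configuration:
# This is only for demonstration; normally you fill in the settings you need directly in the .conf file under VIT
cp /root/mytemp/.conf DemoSpider/VIT/.conf
# Generate a spider script
gzcmd genspider <spider_name> <example.com>
# Generate a scrapy-redis spider script (requires scrapy-redis: pip install scrapy_redis-0.7.3-py2.py3-none-any.whl)
gzcmd genspider -t=sr <spider_name> <example.com>
# Run the spider
scrapy crawl <spider_name>
# Note: gzcmd crawl <spider_name> also works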
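RedisDB
RedisDB supports sentinel mode, cluster mode, and ordinary single-node mode, and wraps the commonly used Redis operations.
Connection
If the connection details are configured in environment variables or in the settings file, no arguments need to be passed.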
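The fallback described here can be pictured as "explicit arguments first, then configuration". The sketch below is an illustration only: `resolve_redis_config` is a hypothetical helper, and it assumes the environment variables use the same names as the settings keys shown in this README (`REDISDB_IP_PORTS`, `REDISDB_USER_PASS`, `REDISDB_DB`). How RedisDB actually resolves its defaults is not shown in this document.

```python
import os
from typing import Mapping, Optional


def resolve_redis_config(
    ip_ports: Optional[str] = None,
    db: Optional[int] = None,
    user_pass: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> dict:
    """Explicit arguments win; otherwise fall back to environment variables.

    Hypothetical helper. The variable names are assumed to match the
    settings keys in this README.
    """
    if ip_ports is None:
        ip_ports = env.get("REDISDB_IP_PORTS")
    if db is None:
        db = int(env.get("REDISDB_DB", "0"))
    if user_pass is None:
        # Treat an empty string as "no password".
        user_pass = env.get("REDISDB_USER_PASS") or None
    return {"ip_ports": ip_ports, "db": db, "user_pass": user_pass}
```

Passing `env` as a parameter keeps the function easy to exercise with a plain dictionary instead of the real process environment.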
Single-node mode
db = RedisDB(ip_ports="localhost:6379", db=0, user_pass=None)
Connecting with a URL
db = RedisDB.from_url("redis://[[username]:[password]]@[host]:[port]/[db]")
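To make the URL template concrete, here is a minimal sketch that splits a `redis://` URL into its parts with the standard library. `parse_redis_url` is a hypothetical helper written for illustration; it is not how `RedisDB.from_url` is implemented, and the default host, port, and db values are assumptions based on common Redis defaults.

```python
from urllib.parse import unquote, urlsplit


def parse_redis_url(url: str) -> dict:
    """Split redis://[[username]:[password]]@[host]:[port]/[db] into fields.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("redis", "rediss"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    db_text = parts.path.lstrip("/")
    return {
        "host": parts.hostname or "localhost",  # assumed default
        "port": parts.port or 6379,             # assumed default
        "db": int(db_text) if db_text else 0,   # assumed default
        "username": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
    }
```

For example, `parse_redis_url("redis://:secret@10.0.0.1:6380/2")` yields host `10.0.0.1`, port `6380`, db `2`, no username, and password `secret`.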
Sentinel mode
db = RedisDB(ip_ports="172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379", db=0, user_pass=None, service_name="my_master")
Note: separate multiple addresses with commas; service_name is required.
The equivalent configuration in the settings file is:
REDISDB_IP_PORTS = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
REDISDB_USER_PASS = ""
REDISDB_DB = 0
REDISDB_SERVICE_NAME = "my_master"
Cluster mode
db = RedisDB(ip_ports="172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379", db=0, user_pass=None)
Note: separate multiple addresses with commas; service_name is not needed.
The equivalent configuration in the settings file is:
REDISDB_IP_PORTS = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
REDISDB_USER_PASS = ""
REDISDB_DB = 0
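The notes for the three modes reduce to a simple rule on the address list and `service_name`: one address means single-node, several addresses with a `service_name` means sentinel, and several addresses without one means cluster. The sketch below encodes that rule as stated in this README. Both function names are hypothetical, and RedisDB's internal logic may differ.

```python
from typing import List, Optional, Tuple


def parse_ip_ports(ip_ports: str) -> List[Tuple[str, int]]:
    """Turn "host1:port1,host2:port2" into [(host1, port1), (host2, port2)]."""
    nodes = []
    for item in ip_ports.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        host, sep, port = item.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected host:port, got {item!r}")
        nodes.append((host, int(port)))
    return nodes


def connection_mode(ip_ports: str, service_name: Optional[str] = None) -> str:
    """Pick a mode using the rule described in this README (illustrative only)."""
    nodes = parse_ip_ports(ip_ports)
    if len(nodes) == 1:
        return "single"
    return "sentinel" if service_name else "cluster"
```

With the addresses from the examples, passing `service_name="my_master"` selects `"sentinel"`, and omitting it selects `"cluster"`.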