A heavily modified utility library for personal use
Derived from
https://github.com/shengchenyang/AyugeSpiderTools/blob/master/docs//docs/intro/install.md
with additional templates for personal use.
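Installation
Requires Python 3.8+.
Install directly with the following command:
pip install gzspidertools
Optional install 1: install all database-related dependencies:
pip install gzspidertools[database]
Optional install 2: install all dependencies with the following command:
pip install gzspidertools[all]
Note: for detailed installation instructions, see the installation guide.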
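Usage
# Show the library version
gzcmd version
# Create a project
gzcmd startproject <project_name>
# Enter the project root directory
cd <project_name>
# Replace (overwrite) the .conf file with your real configuration:
# This is done here only for demonstration convenience; normally you fill in the settings you need directly in the .conf file under VIT
cp /root/mytemp/.conf DemoSpider/VIT/.conf
# Generate a spider script
gzcmd genspider <spider_name> <example.com>
# Generate a scrapy-redis spider script (install scrapy-redis first, e.g. pip install scrapy_redis-0.7.3-py2.py3-none-any.whl)
gzcmd genspider -t=sr <spider_name> <example.com>
# Run the spider
scrapy crawl <spider_name>
# Note: gzcmd crawl <spider_name> also works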
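RedisDB
RedisDB supports Sentinel mode, Cluster mode, and the ordinary single-node mode, and wraps the commonly used Redis operations.
Connecting
If the connection details are already configured in environment variables or in the settings file, no arguments need to be passed.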
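The fallback described above can be pictured with a minimal sketch. The helper `resolve_redis_config` is hypothetical and not part of the library. The precedence shown (explicit argument, then environment variable, then a default) and the reuse of the setting names as environment variable names are assumptions made for illustration only.

```python
import os


def resolve_redis_config(ip_ports=None, db=None, user_pass=None, env=None):
    """Hypothetical helper: explicit arguments win, then the environment, then defaults."""
    # Allow a plain dict to be injected instead of os.environ, which eases testing.
    env = os.environ if env is None else env
    if ip_ports is None:
        # Assumed variable name, mirroring the REDISDB_IP_PORTS setting.
        ip_ports = env.get("REDISDB_IP_PORTS", "localhost:6379")
    if db is None:
        db = int(env.get("REDISDB_DB", 0))
    if user_pass is None:
        # An empty string is treated the same as "no password".
        user_pass = env.get("REDISDB_USER_PASS") or None
    return {"ip_ports": ip_ports, "db": db, "user_pass": user_pass}
```

Calling `resolve_redis_config()` with nothing configured falls back to the single-node defaults used in the normal-mode example.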
Normal (single-node) mode
db = RedisDB(ip_ports="localhost:6379", db=0, user_pass=None)
Connecting by URL
db = RedisDB.from_url("redis://[[username]:[password]]@[host]:[port]/[db]")
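To make the URL layout concrete, here is a small sketch that splits such a URL into its parts with the standard library. `parse_redis_url` is an illustrative helper, not the library's API. The defaults of port 6379 and database 0 when those parts are missing are assumptions.

```python
from urllib.parse import urlsplit


def parse_redis_url(url):
    """Split a redis:// URL into host, port, db, username and password."""
    parts = urlsplit(url)
    # The database index is the path component without its leading slash.
    db_text = parts.path.lstrip("/")
    return {
        "host": parts.hostname,
        "port": parts.port or 6379,  # assumed default port
        "db": int(db_text) if db_text else 0,  # assumed default database
        "username": parts.username or None,
        "password": parts.password,
    }
```

For example, `parse_redis_url("redis://:secret@localhost:6379/1")` yields host `localhost`, port `6379`, database `1`, no username, and password `secret`.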
Sentinel mode
db = RedisDB(ip_ports="172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379", db=0, user_pass=None, service_name="my_master")
Note: separate multiple addresses with commas; service_name must be passed.
The equivalent configuration in the settings file is:
REDISDB_IP_PORTS = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
REDISDB_USER_PASS = ""
REDISDB_DB = 0
REDISDB_SERVICE_NAME = "my_master"
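The comma-separated address string can be split into (host, port) pairs, which is the shape that redis-py's `Sentinel` class accepts for its list of sentinels. The sketch below shows only the string handling. `split_ip_ports` is a hypothetical helper, and whether RedisDB parses the value this way internally is an assumption.

```python
def split_ip_ports(ip_ports):
    """Turn "host1:port1,host2:port2" into [(host1, port1), (host2, port2)]."""
    nodes = []
    for item in ip_ports.split(","):
        item = item.strip()
        if not item:
            # Skip empty entries such as a trailing comma.
            continue
        # Split on the last colon so the port is always the final field.
        host, _, port = item.rpartition(":")
        nodes.append((host, int(port)))
    return nodes
```

With the addresses from the example above, the result is a list of three pairs, each ending in port `26379`.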
Cluster mode
db = RedisDB(ip_ports="172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379", db=0, user_pass=None)
Note: separate multiple addresses with commas; service_name is not passed.
The equivalent configuration in the settings file is:
REDISDB_IP_PORTS = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
REDISDB_USER_PASS = ""
REDISDB_DB = 0
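Taken together, the three connection examples suggest a simple rule: several addresses plus a service_name means Sentinel, several addresses without one means Cluster, and a single address means normal mode. The sketch below encodes that rule. `detect_mode` is a hypothetical helper, and treating a single address as normal mode even when a service_name is given is an assumption, since the text does not cover that case.

```python
def detect_mode(ip_ports, service_name=None):
    """Guess the connection mode from the address list and service_name."""
    addresses = [a.strip() for a in ip_ports.split(",") if a.strip()]
    if len(addresses) > 1:
        # Multiple addresses: the presence of service_name separates the two modes.
        return "sentinel" if service_name else "cluster"
    # Assumption: one address is always treated as a single node.
    return "single"
```

This keeps the choice of mode a pure function of the two settings, so it can be checked without a running Redis server.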