A customized fork of a utility library
Project description
Based on AyugeSpiderTools:
https://github.com/shengchenyang/AyugeSpiderTools/blob/master/docs//docs/intro/install.md
with templates added for personal use.
Installation
Requires Python 3.8+.
Install the package directly with the following command:
pip install gzspidertools
Optional install 1: install all database-related dependencies:
pip install gzspidertools[database]
Optional install 2: install all dependencies:
pip install gzspidertools[all]
Note: for detailed installation instructions, see the installation guide.
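To confirm from Python that the package was installed, the standard library's importlib.metadata can be queried. This is a minimal sketch; the helper name installed_version is hypothetical and not part of gzspidertools.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string if gzspidertools is installed, otherwise None.
    print(installed_version("gzspidertools"))
```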
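Usage
# Show the library version
gzcmd version
# Create a project
gzcmd startproject <project_name>
# Enter the project root directory
cd <project_name>
# Replace (overwrite) the .conf file with your real configuration:
# This is only for demonstration; normally you fill in the settings you need directly in the .conf file under VIT
cp /root/mytemp/.conf DemoSpider/VIT/.conf
# Generate a spider script
gzcmd genspider <spider_name> <example.com>
# Generate a scrapy-redis spider script (install scrapy-redis first: pip install scrapy_redis-0.7.3-py2.py3-none-any.whl)
gzcmd genspider -t=sr <spider_name> <example.com>
# Run the spider
scrapy crawl <spider_name>
# Note: gzcmd crawl <spider_name> also works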
RedisDB
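RedisDB supports Sentinel mode, Cluster mode, and ordinary single-node mode, and wraps the commonly used Redis operations.
Connecting
If the connection details are configured in environment variables or in the settings file, the arguments can be omitted.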
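The fallback described above (explicit arguments first, then configured values) can be sketched as follows. This is an illustration only: resolve_redis_params is a hypothetical helper, reusing the REDISDB_* setting names as environment variable names is an assumption, and the defaults are placeholders rather than the library's documented behavior.

```python
import os
from typing import Any, Dict, Mapping, Optional


def resolve_redis_params(
    ip_ports: Optional[str] = None,
    db: Optional[int] = None,
    user_pass: Optional[str] = None,
    service_name: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Prefer explicit arguments, then environment values, then defaults."""
    env = os.environ if env is None else env

    def pick(explicit: Any, key: str, default: Any) -> Any:
        # An explicit argument always wins over the configured value.
        if explicit is not None:
            return explicit
        return env.get(key, default)

    return {
        "ip_ports": pick(ip_ports, "REDISDB_IP_PORTS", "localhost:6379"),
        "db": int(pick(db, "REDISDB_DB", 0)),
        # Empty strings (e.g. REDISDB_USER_PASS = "") are treated as "not set".
        "user_pass": pick(user_pass, "REDISDB_USER_PASS", None) or None,
        "service_name": pick(service_name, "REDISDB_SERVICE_NAME", None) or None,
    }
```

Passing env as a plain dict keeps the sketch easy to test without touching the real process environment.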
Single-node mode
db = RedisDB(ip_ports="localhost:6379", db=0, user_pass=None)
Connecting with a URL
db = RedisDB.from_url("redis://[[username]:[password]]@[host]:[port]/[db]")
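To make the URL layout concrete, the sketch below splits a Redis URL into its components using only the standard library. parse_redis_url is a hypothetical helper written for illustration; it is not how RedisDB.from_url is implemented, and the fallback values (localhost, 6379, db 0) are assumptions.

```python
from urllib.parse import unquote, urlparse


def parse_redis_url(url: str) -> dict:
    """Split redis://[[username]:[password]]@[host]:[port]/[db] into parts."""
    parts = urlparse(url)
    if parts.scheme not in ("redis", "rediss"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    db_text = parts.path.lstrip("/")
    return {
        "host": parts.hostname or "localhost",
        "port": parts.port or 6379,
        "db": int(db_text) if db_text else 0,
        # Credentials may be percent-encoded inside a URL.
        "username": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
    }


if __name__ == "__main__":
    print(parse_redis_url("redis://:secret@localhost:6379/0"))
```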
Sentinel mode
db = RedisDB(ip_ports="172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379", db=0, user_pass=None, service_name="my_master")
Note: separate multiple addresses with commas; service_name is required.
The equivalent configuration in the settings file is:
REDISDB_IP_PORTS = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
REDISDB_USER_PASS = ""
REDISDB_DB = 0
REDISDB_SERVICE_NAME = "my_master"
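A comma-separated ip_ports string has to be split into individual nodes before it can be used; redis-py's Sentinel client, for example, takes a list of (host, port) tuples. The sketch below shows one way to do that split. split_ip_ports is a hypothetical helper, and whether gzspidertools parses the string this way internally is an assumption.

```python
from typing import List, Tuple


def split_ip_ports(ip_ports: str) -> List[Tuple[str, int]]:
    """Turn "host1:port1,host2:port2" into [(host1, port1), (host2, port2)]."""
    nodes: List[Tuple[str, int]] = []
    for item in ip_ports.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        host, sep, port = item.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected host:port, got {item!r}")
        nodes.append((host, int(port)))
    return nodes


if __name__ == "__main__":
    print(split_ip_ports("172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"))
```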
Cluster mode
db = RedisDB(ip_ports="172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379", db=0, user_pass=None)
Note: separate multiple addresses with commas; service_name is not needed.
The equivalent configuration in the settings file is:
REDISDB_IP_PORTS = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
REDISDB_USER_PASS = ""
REDISDB_DB = 0
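The three modes differ only in how many addresses are given and whether service_name is set. The sketch below encodes that rule as a small function. infer_mode is a hypothetical helper, and the exact rule (in particular, treating any configuration with a service_name as Sentinel) is an assumption rather than the library's documented logic.

```python
from typing import Optional


def infer_mode(ip_ports: str, service_name: Optional[str] = None) -> str:
    """Guess the connection mode from the address list and service_name."""
    count = len([p for p in ip_ports.split(",") if p.strip()])
    if count == 0:
        raise ValueError("ip_ports must contain at least one host:port")
    if service_name:
        # Assumption: a service_name always signals Sentinel mode.
        return "sentinel"
    # Several addresses without a service_name are treated as a cluster.
    return "cluster" if count > 1 else "standalone"


if __name__ == "__main__":
    addresses = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
    print(infer_mode("localhost:6379"))
    print(infer_mode(addresses, service_name="my_master"))
    print(infer_mode(addresses))
```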