A heavily modified utility library for personal use

Project description

Derived from

https://github.com/shengchenyang/AyugeSpiderTools/blob/master/docs//docs/intro/install.md

with templates added for personal use.

Installation

With Python 3.8+, install directly with the following command:

pip install gzspidertools

Optional install 1: install all database-related dependencies:

pip install gzspidertools[database]

Optional install 2: install all dependencies with the following command:

pip install gzspidertools[all]

Note: see the installation guide for detailed installation instructions.
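
After installing, it can be handy to confirm from Python which version is actually present. The following is a minimal sketch that relies only on the standard library's importlib.metadata; the helper name installed_version is hypothetical and is not part of gzspidertools.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    # Prints a version string if the package is installed, otherwise None.
    print(installed_version("gzspidertools"))
```

A None result usually means the pip command ran against a different interpreter or virtual environment than the one executing this script.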

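Usage

# Show the library version
gzcmd version

# Create a project
gzcmd startproject <project_name>

# Enter the project root directory
cd <project_name>

# Replace (overwrite) the .conf file with your real configuration.
# This is only for demonstration; normally you fill in the settings you need directly in the .conf file under VIT.
cp /root/mytemp/.conf DemoSpider/VIT/.conf

# Generate a spider script
gzcmd genspider <spider_name> <example.com>

# Generate a scrapy-redis spider script (install scrapy-redis first, e.g. pip install scrapy_redis-0.7.3-py2.py3-none-any.whl)
gzcmd genspider -t=sr <spider_name> <example.com>

# Run the spider
scrapy crawl <spider_name>
# Note: gzcmd crawl <spider_name> also works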
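
The same steps can be scripted. The sketch below is an illustration under assumptions, not something shipped with the library: build_commands and run_workflow are hypothetical helper names, the command strings are copied from the usage text above, and the .conf copy step is deliberately left out because its paths are specific to your machine.

```python
import shutil
import subprocess
from typing import List


def build_commands(project: str, spider: str, domain: str,
                   scrapy_redis: bool = False) -> List[List[str]]:
    """Build argv lists for the CLI steps in the usage text (cd and cp excluded)."""
    genspider = ["gzcmd", "genspider"]
    if scrapy_redis:
        # Template flag for scrapy-redis spiders, as written in the usage text.
        genspider.append("-t=sr")
    genspider += [spider, domain]
    return [
        ["gzcmd", "startproject", project],
        genspider,
        ["scrapy", "crawl", spider],
    ]


def run_workflow(project: str, spider: str, domain: str,
                 scrapy_redis: bool = False) -> None:
    """Run the steps in order; everything after startproject runs in the project root."""
    if shutil.which("gzcmd") is None:
        raise RuntimeError("gzcmd not found on PATH; install gzspidertools first")
    first, *rest = build_commands(project, spider, domain, scrapy_redis)
    subprocess.run(first, check=True)
    for argv in rest:
        # Mirrors the `cd <project_name>` step from the usage text.
        subprocess.run(argv, check=True, cwd=project)
```

Since the configuration step is skipped here, the final crawl may fail until the .conf file under VIT has been filled in.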

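RedisDB

RedisDB supports Sentinel mode, cluster mode, and ordinary single-node mode, and wraps the commonly used Redis operations.

Connection

If the connection parameters are already configured in environment variables or in the settings file, no arguments need to be passed.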
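
How such a fallback might work can be sketched as follows. This is not the library's implementation: the function name resolve_redis_params is hypothetical, the precedence order (explicit argument, then environment variable, then settings) is an assumption, and so is the idea that the environment variables share the REDISDB_* names used in the settings examples of this document.

```python
import os
from typing import Any, Dict, Mapping, Optional

# Maps constructor argument names to the REDISDB_* setting names.
# Reusing these names for environment variables is an assumption.
_KEYS = {
    "ip_ports": "REDISDB_IP_PORTS",
    "user_pass": "REDISDB_USER_PASS",
    "db": "REDISDB_DB",
    "service_name": "REDISDB_SERVICE_NAME",
}


def resolve_redis_params(
    explicit: Optional[Mapping[str, Any]] = None,
    settings: Optional[Mapping[str, Any]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Resolve each parameter: explicit value, then environment, then settings."""
    explicit = explicit or {}
    settings = settings or {}
    environ = os.environ if environ is None else environ

    resolved: Dict[str, Any] = {}
    for arg, key in _KEYS.items():
        if explicit.get(arg) is not None:
            value = explicit[arg]
        elif key in environ:
            value = environ[key]
        else:
            value = settings.get(key)
        if arg == "db" and value not in (None, ""):
            # Environment variables are strings, so normalise the db index.
            value = int(value)
        resolved[arg] = value
    return resolved
```

For example, calling resolve_redis_params with no explicit arguments and a settings mapping that defines REDISDB_IP_PORTS returns that address under the ip_ports key.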

Normal mode (single node)

db = RedisDB(ip_ports="localhost:6379", db=0, user_pass=None)

Connect using a URL

db = RedisDB.from_url("redis://[[username]:[password]]@[host]:[port]/[db]")
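
To make the URL layout concrete, the sketch below splits a redis:// URL into its parts with the standard library. It only illustrates the format; how RedisDB.from_url parses the URL internally is not shown in this document, and parse_redis_url is a hypothetical helper. The defaults of localhost, port 6379, and db 0 are assumptions based on common Redis conventions.

```python
from typing import Any, Dict
from urllib.parse import unquote, urlsplit


def parse_redis_url(url: str) -> Dict[str, Any]:
    """Split redis://[[username]:[password]]@[host]:[port]/[db] into its fields."""
    parts = urlsplit(url)
    if parts.scheme not in ("redis", "rediss"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    db_text = parts.path.lstrip("/")
    return {
        "host": parts.hostname or "localhost",
        "port": parts.port or 6379,
        "db": int(db_text) if db_text else 0,
        # Credentials may be percent-encoded inside a URL.
        "username": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
    }


if __name__ == "__main__":
    print(parse_redis_url("redis://:secret@localhost:6379/0"))
```

A password-only URL keeps the colon before the password, as in redis://:secret@localhost:6379/0.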

Sentinel mode

db = RedisDB(ip_ports="172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379", db=0, user_pass=None, service_name="my_master")

Note: separate multiple addresses with commas; service_name must be passed.
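
The comma-separated ip_ports string can be turned into (host, port) pairs, which is the shape that redis-py's Sentinel class accepts for its list of sentinels. The sketch below is illustrative only: parse_ip_ports is a hypothetical helper, and whether RedisDB parses the string this way internally is not stated here.

```python
from typing import List, Tuple


def parse_ip_ports(ip_ports: str) -> List[Tuple[str, int]]:
    """Split 'host:port,host:port' into a list of (host, port) tuples."""
    nodes: List[Tuple[str, int]] = []
    for item in ip_ports.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas and whitespace
        host, sep, port = item.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected host:port, got {item!r}")
        nodes.append((host, int(port)))
    return nodes


if __name__ == "__main__":
    print(parse_ip_ports("172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"))
```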

The corresponding configuration in the settings file is:

REDISDB_IP_PORTS = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
REDISDB_USER_PASS = ""
REDISDB_DB = 0
REDISDB_SERVICE_NAME = "my_master"

Cluster mode

db = RedisDB(ip_ports="172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379", db=0, user_pass=None)

Note: separate multiple addresses with commas; service_name is not passed.
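
Taken together, the three connection examples suggest a simple selection rule: a service_name implies Sentinel mode, several addresses without one imply cluster mode, and a single address implies normal mode. The sketch below encodes that reading. It is an inference from the examples in this document rather than a description of RedisDB's actual logic, and infer_mode is a hypothetical name.

```python
from typing import Optional


def infer_mode(ip_ports: str, service_name: Optional[str] = None) -> str:
    """Guess the connection mode from the arguments, per the examples in this document."""
    addresses = [a.strip() for a in ip_ports.split(",") if a.strip()]
    if not addresses:
        raise ValueError("ip_ports must contain at least one host:port")
    if service_name:
        return "sentinel"
    if len(addresses) > 1:
        return "cluster"
    return "single"


if __name__ == "__main__":
    nodes = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
    print(infer_mode(nodes, service_name="my_master"))
    print(infer_mode(nodes))
    print(infer_mode("localhost:6379"))
```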

The corresponding configuration in the settings file is:

REDISDB_IP_PORTS = "172.25.21.4:26379,172.25.21.5:26379,172.25.21.6:26379"
REDISDB_USER_PASS = ""
REDISDB_DB = 0
