Simple, handy web-scraping utilities that cut down on duplicated code and redundant files
Project description
easy spider tool
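A collection of simple, handy scraping utilities accumulated during day-to-day work, intended to reduce duplicated code and redundant files. I hope they prove just as useful to others. If you would like to contribute a good code snippet, please email the code together with a description to xinkonghan@gmail.com. The code style follows my own preferences; please point out any shortcomings.
Installation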
pip install easy_spider_tool
Main features
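Time
- before_day: yesterday's date (can be used to step dates backward)
- after_day: tomorrow's date (can be used to step dates forward)
- between_day: the dates between two given dates
- current_date: the current time
- timestamp: the current timestamp (millisecond precision supported)
- date_parse: parses times in arbitrary formats (supports time zone conversion and keeping only the date or the time part, with a configurable default value)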
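The date stepping and date range ideas above can be illustrated with the standard library alone. This is a minimal sketch, not the package's implementation: `shift_day` and `days_between` are hypothetical helper names, and treating the range as inclusive at both ends is an assumption.

```python
from datetime import date, timedelta


def shift_day(day: date, offset: int) -> date:
    """Return the date `offset` days away from `day` (negative goes back)."""
    return day + timedelta(days=offset)


def days_between(start: date, end: date) -> list:
    """List every date from `start` to `end`, both ends included (assumed)."""
    span = (end - start).days
    return [start + timedelta(days=i) for i in range(span + 1)]


start = date(2024, 2, 27)
print(shift_day(start, -1))  # 2024-02-26
print(shift_day(start, 1))   # 2024-02-28
print([d.isoformat() for d in days_between(start, date(2024, 3, 1))])
# ['2024-02-27', '2024-02-28', '2024-02-29', '2024-03-01']
```

Calling `shift_day` repeatedly with the same offset is one way to walk a crawl window backward or forward one day at a time.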
JSON
- format_json: pretty-printed output
- jsonpath: evaluates any number of JSON paths (supports a default value and selecting the first matching value)
Hash digests
- md5: MD5 digest of a string
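For reference, hashing a string with MD5 in plain Python looks roughly like the sketch below. `md5_hex` is a hypothetical name used for illustration, and the UTF-8 default is an assumption; the package's own `md5` helper may differ in signature and defaults.

```python
import hashlib


def md5_hex(text: str, encoding: str = "utf-8") -> str:
    """Return the hexadecimal MD5 digest of `text` (encoding is assumed UTF-8)."""
    return hashlib.md5(text.encode(encoding)).hexdigest()


print(md5_hex("hello"))  # 5d41402abc4b2a76b9719d911017c592
```

In scraping work a digest like this is typically used as a deduplication key for URLs or page bodies, not for security purposes.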
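Regular expression matching
- regex_match: conditional matching (supports multiple unrelated patterns, a default value, and selecting the first matching value)
- for_to_regx_match: matching against multiple unrelated patterns (retained for compatibility with older versions)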
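The "try several unrelated patterns, take the first hit, otherwise fall back to a default" idea can be sketched with the standard `re` module. `first_match` is a hypothetical helper written for this illustration; it does not reproduce the signature of `regex_match`.

```python
import re


def first_match(text, patterns, default=None):
    """Try each pattern in order and return the first match found.

    Returns capture group 1 when the pattern defines groups, otherwise the
    whole match. Falls back to `default` when no pattern matches.
    """
    for pattern in patterns:
        found = re.search(pattern, text)
        if found:
            return found.group(1) if found.groups() else found.group(0)
    return default


page = "price: 42 USD"
print(first_match(page, [r"cost: (\d+)", r"price: (\d+)"]))  # 42
print(first_match(page, [r"stock: (\d+)"], default="n/a"))   # n/a
```

Ordering the patterns from most to least specific keeps the fallback behaviour predictable when a page layout changes.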
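Data cleaning and conversion
- cookie_to_dic: converts a cookie into a dictionary (Dict)
- clear_value: removes specified values from a list (List) or dictionary (Dict), recursing through all nested dictionaries and lists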
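Both cleaning steps above are easy to picture in plain Python. The sketch below is an illustration under assumptions, not the package's code: `cookie_header_to_dict` and `remove_values` are hypothetical names, the cookie is assumed to be a `name=value; name=value` header string, and `None` and the empty string are assumed as the default values to strip.

```python
def cookie_header_to_dict(header: str) -> dict:
    """Split a 'name=value; name=value' cookie string into a dict."""
    result = {}
    for pair in header.split(";"):
        # partition keeps any further '=' characters inside the value
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            result[name] = value
    return result


def remove_values(data, unwanted=(None, "")):
    """Recursively drop every value found in `unwanted` from dicts and lists."""
    if isinstance(data, dict):
        return {k: remove_values(v, unwanted)
                for k, v in data.items() if v not in unwanted}
    if isinstance(data, list):
        return [remove_values(v, unwanted) for v in data if v not in unwanted]
    return data


print(cookie_header_to_dict("sessionid=abc123; theme=dark; token=a=b"))
# {'sessionid': 'abc123', 'theme': 'dark', 'token': 'a=b'}

record = {"name": "admin", "email": "", "tags": ["a", None, "b"],
          "profile": {"age": None, "city": "Paris"}}
print(remove_values(record))
# {'name': 'admin', 'tags': ['a', 'b'], 'profile': {'city': 'Paris'}}
```

Note that `remove_values` returns a new structure and leaves the input untouched, which makes it safe to apply to data that is still needed in its raw form.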
Validation
- verify_ip_address: IP address validation
- verify_domain_name: domain name validation
- verify_port: port validation
- verify_url: URL validation
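Checks of this kind can be approximated with the standard library, as in the sketch below. The helper names `is_valid_ip`, `is_valid_port`, and `is_valid_url` are hypothetical, and the rules are assumptions made for the example: ports are accepted in the range 0 to 65535, and only `http` and `https` URLs with a host part count as valid. The package's own validators may apply different rules.

```python
import ipaddress
from urllib.parse import urlsplit


def is_valid_ip(text: str) -> bool:
    """Accept any well-formed IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
    except ValueError:
        return False
    return True


def is_valid_port(value) -> bool:
    """Accept integers (or numeric strings) from 0 to 65535 (assumed range)."""
    try:
        port = int(value)
    except (TypeError, ValueError):
        return False
    return 0 <= port <= 65535


def is_valid_url(text: str) -> bool:
    """Accept http/https URLs that include a host part (assumed rule)."""
    try:
        parts = urlsplit(text)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(is_valid_ip("192.168.1.1"), is_valid_ip("256.1.1.1"))      # True False
print(is_valid_port("8080"), is_valid_port(70000))               # True False
print(is_valid_url("https://example.com/a"), is_valid_url("x"))  # True False
```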
Notifications
- None yet
Basic usage
from easy_spider_tool import format_json, jsonpath

data = {
    "code": 200,
    "data": [
        {
            "id": 1,
            "username": "admin",
            "level": "boss"
        },
        {
            "id": 2,
            "username": "user",
            "level": "staff"
        }
    ]
}

# first=True selects only the first matching value
boss_name = jsonpath(data, '$.data[?(@.level=="boss")].username', first=True)
# Without first=True, all matching usernames are collected
all_user_info = jsonpath(data, '$.data[*].username')

print(boss_name)
print(format_json(all_user_info))
Links
GitHub: https://github.com/hanxinkong/easy-spider-tool
Online documentation: https://easy-spider-tool.xink.top/
Credits
This tool draws on work by the author xingcweb, with some features added at my own discretion.
Download files
Source Distribution
easy_spider_tool-1.0.16.tar.gz (12.0 kB)
Built Distribution
easy_spider_tool-1.0.16-py3-none-any.whl (13.5 kB)
File details
Details for the file easy_spider_tool-1.0.16.tar.gz.
File metadata
- Download URL: easy_spider_tool-1.0.16.tar.gz
- Upload date:
- Size: 12.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.7.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 51b1561bff13c8706af99e60da6b847844dc80755af1e9ed9aef0b126712217f |
| MD5 | 36ff87f57a700416b57c06275855e7b9 |
| BLAKE2b-256 | 9e8afd38d87f3b11713e6202b58e14ced3baa8a206b65a330ac43c6ca2990efe |
File details
Details for the file easy_spider_tool-1.0.16-py3-none-any.whl.
File metadata
- Download URL: easy_spider_tool-1.0.16-py3-none-any.whl
- Upload date:
- Size: 13.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.7.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 23b460e777d41b1767135d89f455dbb18b41456fd02b134b106419f89fe8b889 |
| MD5 | fbdd546c5d6bc9dc405e85df65e9e082 |
| BLAKE2b-256 | e040023ea7750e28c69e61498bf35b8880eceb34f36db5b1eee83c6faebd484b |