A watchdog toolkit for checking web page changes.
onwebchange
- Default Console Web UI.
- RSS support.
- Release on pypi.
- Add a tag filter to distinguish the RSS sites, with support for multiple tags.
- Add .pyz usage for fast deployment.
- Support one-click RSS subscription.
Install
> pip3 install onwebchange -U
> python3 -m onwebchange -f wc.config -i 300 --host=127.0.0.1 -p 8080 --username=admin --password=admin
Or use shiv to build a single file, "onwebchange.pyz", for fast deployment:
> pip3 install shiv -U
> shiv -o onwebchange.pyz -e onwebchange.__main__:main --compressed onwebchange
> python3.7 onwebchange.pyz --username=admin --password=admin
Requirements
torequests click bottle objectpath beautifulsoup4
Quick start
- install
python3 -m onwebchange
- Add the shell command to systemd / supervisor.
- Run with a username and password:
python3 -m onwebchange -f wc.config -i 300 --host=127.0.0.1 -p 8080 --username=admin --password=admin
- Add Tasks
  - Press the [AddTask] button
  - Fill in the blanks:
    name: "pypi trending projects no1"
    request_args: "https://pypi.org/"
    parser_name: "css"
    operation: "#content > div:nth-child(4) > div > div:nth-child(1) > ul > li:nth-child(1) > a > h3 > span.package-snippet__name"
    value: "$text"
    check_interval: 300
    max_change: 10
  - Press the [Update Task] button
- Subscribe to the RSS feed from a Chrome RSS reader extension
Default Web UI
Example
Run as the main package from the command line:
python3 -m onwebchange -f wc.config -i 300 -a
or start it from Python code:
from onwebchange.core import WebHandler
from onwebchange.webui import app

if __name__ == "__main__":
    wh = WebHandler(
        app,
        file_path=None,
        loop_interval=300,
        auto_open_browser=True,
        change_callback=lambda task: print(task.name),
        app_kwargs={'port': 9988})
    # python3 -m onwebchange -f wc.config -i 300 -a
    wh.run()
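The `change_callback` argument receives the task whose result changed, as the lambda above suggests. Below is a minimal sketch of a slightly richer callback. It assumes only that the task exposes a `name` attribute; the `origin_url` attribute access, the `format_change` / `on_change` names, and the `SimpleNamespace` stand-in are illustrative assumptions, not the library's actual task class.

```python
from types import SimpleNamespace


def format_change(task):
    """Build a one-line message for a changed task."""
    # `origin_url` is assumed here; fall back if the attribute is missing or empty.
    url = getattr(task, "origin_url", "") or "(no url)"
    return f"[changed] {task.name} -> {url}"


def on_change(task):
    """Callback with the same shape as the lambda: one task argument."""
    print(format_change(task))


# Stand-in object for demonstration only, not a real onwebchange task.
demo_task = SimpleNamespace(name="task name0", origin_url="https://pypi.org")
on_change(demo_task)
```

A function like `on_change` could then be passed as `change_callback=on_change` in place of the lambda.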
Parser examples
- regex
  - parser_name: re
  - operation: class="(.*?)"
  - value: $1
- css selector for attribute
  - parser_name: css
  - operation: #J_all_item_910789
  - value: @class
  - value also can be:
    - $string
      - list of outer HTML
    - $text
      - list of node.text
    - $get_text
      - list of node.get_text()
- json (ObjectPath).
  - with json-handle chrome extension.
  - parser_name: json
  - operation: $.headers["Accept-Encoding"]
  - value: $text
- python
  - parser_name: python
  - operation:
    def parse(resp): return resp.text[:10]
  - value as null
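To make the regex parser settings concrete, here is a minimal sketch of how an `operation` pattern and a `$N` value could be combined using the standard `re` module. The `apply_regex` helper is hypothetical and is not the library's own implementation; it only illustrates that `$1` selects the first capture group and `$0` the whole match.

```python
import re


def apply_regex(text, operation, value):
    """Return the chosen group of every match of `operation` in `text`."""
    # "$1" -> group 1, "$0" -> group 0 (the whole match).
    group_index = int(value.lstrip("$"))
    return [m.group(group_index) for m in re.finditer(operation, text)]


html = '<a class="btn">x</a><p class="lede-paragraph">y</p>'
print(apply_regex(html, r'class="(.*?)"', "$1"))
# ['btn', 'lede-paragraph']
```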
New Task template
{
"name": "task name0",
"request_args": "https://pypi.org", # could be url, curl string, request args dict.
"parser_name": "css", # could be re/css/json/python
"operation": ".lede-paragraph",
"value": "$text",
"check_interval": 300,
"max_change": 2,
"sorting_list": true,
"origin_url": "https://pypi.org",
"encoding": null
}
{
"name": "task name1",
"request_args": "https://pypi.org",
"parser_name": "re",
"operation": "class=\"(lede-paragraph)\"",
"value": "$1",
"check_interval": 300,
"max_change": 2,
"sorting_list": true,
"origin_url": "",
"encoding": null
}
{
"name": "task name2",
"request_args": "http://httpbin.org/get",
"parser_name": "json",
"operation": "$.url",
"value": "",
"check_interval": 300,
"sorting_list": true,
"origin_url": "",
"encoding": null
}
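The templates above note that `request_args` can be a URL, a curl string, or a request args dict. The sketch below shows one way those three forms could be reduced to a single dict. The `normalize_request_args` helper is hypothetical, and it handles only a small curl subset (the URL and `-H` headers); the library's real parsing may differ.

```python
import shlex


def normalize_request_args(request_args):
    """Turn a URL, a dict, or a simple curl string into a request dict."""
    if isinstance(request_args, dict):
        # Default to GET unless the dict already names a method.
        return {"method": "get", **request_args}
    text = request_args.strip()
    if not text.startswith("curl "):
        return {"url": text, "method": "get"}
    # Small curl subset: the URL and -H 'name: value' headers only.
    tokens = shlex.split(text)[1:]
    result = {"method": "get", "headers": {}}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in ("-H", "--header") and i + 1 < len(tokens):
            name, _, header_value = tokens[i + 1].partition(":")
            result["headers"][name.strip()] = header_value.strip()
            i += 2
            continue
        if not token.startswith("-"):
            result["url"] = token
        i += 1  # other flags such as --compressed are ignored
    return result


print(normalize_request_args("https://pypi.org"))
print(normalize_request_args("curl 'https://pypi.org/' -H 'dnt: 1' --compressed"))
```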
More docs
Watchdog task.
:param name: Task name.
:type name: str
:param request_args: arguments for sending a request, could be url/curl_string/dict.
:type request_args: dict / str
:param parser_name: re, css, json, python, defaults to None, which uses the resp.text.
:type parser_name: str, optional
:param operation: parse operation for the parser_name, defaults to None
:type operation: str, optional
:param value: value operation for the parser, defaults to None
:type value: str, optional
:param sorting_list: whether to sort the list of results from `css or other parsers`, defaults to True
:type sorting_list: bool, optional
:param check_interval: interval between checks, defaults to 60 seconds
:type check_interval: int, optional
:param max_change: number of latest changes to keep in check_result_list, defaults to 2
:type max_change: int, optional
:param check_result_list: the latest `max_change` check results, usually shortened with md5, defaults to None
:type check_result_list: list, optional
:param origin_url: the url for viewing the change, defaults to request_args['url']
:type origin_url: str, optional
request_args examples:
    url:
        http://pypi.org
    args:
        {'url': 'http://pypi.org', 'method': 'get'}
    curl:
        curl 'https://pypi.org/' -H 'authority: pypi.org' -H 'cache-control: max-age=0' -H 'upgrade-insecure-requests: 1' -H 'user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36' -H 'sec-fetch-mode: navigate' -H 'sec-fetch-user: ?1' -H 'dnt: 1' -H 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3' -H 'sec-fetch-site: none' -H 'accept-encoding: gzip, deflate, br' -H 'accept-language: zh-CN,zh;q=0.9' -H 'cookie: user_id__insecure=; session_id=' --compressed
parser examples:
    re:
        operation = '.*?abc'
        value = '$0' (or '$1', `$` means the group index for regex result)
    css:
        operation = ".className"
        value = '$string'
            $string: return [node] as outer html
            $text: return [node.text]
            $get_text: return [node.get_text()]
            @attr: [get attribute from node]
    json:
        view more: https://github.com/adriank/ObjectPath
        # input response JSON string: {"a": 1}
        operation = "$.a"
        value = None
    python:
        ! function name should always be `parse` if value is None,
        or use `value` as the function name.
        `operation can be a function object.`
        operation = lambda resp: resp.text
        operation = r'''
def parse(resp):
    return md5(resp.text)
'''
        value = None
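The python parser example and the `check_result_list` / `max_change` parameters can be illustrated together. In the sketch below, `parse` shortens the response body with `hashlib.md5` (the `md5` name in the docstring is presumably a helper made available to the parser; `hashlib` is swapped in here), and a bounded `deque` keeps only the latest `max_change` results. The `record_result` helper and the `SimpleNamespace` response stand-in are assumptions for illustration, not the library's internals.

```python
import hashlib
from collections import deque
from types import SimpleNamespace


def parse(resp):
    """Shorten the response body to an md5 hex digest."""
    return hashlib.md5(resp.text.encode("utf-8")).hexdigest()


def record_result(history, digest):
    """Append `digest` if it differs from the latest one; return True on change."""
    if history and history[-1] == digest:
        return False
    history.append(digest)
    return True


max_change = 2
# deque(maxlen=...) silently drops the oldest entry once it is full.
check_result_list = deque(maxlen=max_change)
for body in ["v1", "v1", "v2", "v3"]:
    resp = SimpleNamespace(text=body)  # stand-in for a real response object
    changed = record_result(check_result_list, parse(resp))
    print(body, changed, len(check_result_list))
```

Storing digests rather than full bodies keeps the history small, at the cost of not being able to show what exactly changed.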