Scrape and Ntfy
An extremely customizable web scraper with a modular notification system and persistent storage via SQLite.
Features
- Modular notification system
  - Currently supports webhooks (e.g. Discord, Slack) and ntfy.sh
- Web scraping via Selenium
- Simple configuration of multiple scrapers with conditional notifications
Usage
Prerequisites
- A browser
  - Most Chromium-based browsers and Firefox-based browsers should work
  - Edge is not recommended
  - Selenium should also be able to download and cache the appropriate browser if necessary
Basic Configuration
- Configuration for the web scraper is handled through a TOML file
  - To see an example configuration, see `config.example.toml`
    - This can be copied to `config.toml` and edited to suit your needs
  - To get the CSS selector for an element, you can use your browser's developer tools (`F12`, `Ctrl+Shift+I`, right-click -> Inspect Element, etc.)
    - If you're not already in inspect element mode, you can press `Ctrl+Shift+C` to enter it (or just click the inspect button in the developer tools)
    - Click on the element you want to select
    - Right-click on the element in the HTML pane
    - Click "Copy" -> "Copy selector"
- Some other configuration is handled through environment variables and/or command-line arguments (see `--help` for more information)
  - For example, to set the path to the configuration file, you can set the `PATH_TO_TOML` environment variable or use the `--path-to-toml` command-line argument
Docker (Recommended)
Specific prerequisites
- Docker
  - Docker is a platform for developing, shipping, and running applications in containers
- Docker Compose
Installation and usage
- Clone the repository
  - `git clone https://github.com/slashtechno/scrape-and-ntfy`
- Change directory into the repository
  - `cd scrape-and-ntfy`
- Configure via `config.toml`
  - Optionally, you can configure some other options via environment variables or command-line arguments in the `docker-compose.yml` file
- `docker compose up -d`
  - The `-d` flag runs the containers in the background
  - If you want, you can run `sqlite-web` by uncommenting the appropriate lines in `docker-compose.yml` to view the database in a browser on localhost:5050
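As an alternative to a web UI, the SQLite database can also be inspected from Python. The sketch below lists each table with its row count. It assumes nothing about the project's schema, and the helper name `summarize_database` is hypothetical. The location of the database file is not stated here, so any path passed to `sqlite3.connect` is a placeholder you would need to replace.

```python
import sqlite3


def summarize_database(conn: sqlite3.Connection) -> dict[str, int]:
    """Return {table_name: row_count} for every non-internal table."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    summary = {}
    for name in tables:
        # Table names cannot be bound as parameters, so quote them manually.
        quoted = '"' + name.replace('"', '""') + '"'
        summary[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    return summary
```

For example, `summarize_database(sqlite3.connect("path/to/database.db"))` would report the tables in a file at that (placeholder) path.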
pip
Specific prerequisites
- Python (3.11+)
Installation and usage
- Install with `pip`
  - `pip install scrape-and-ntfy`
  - Depending on your system, you may need to use `pip3` instead of `pip`, or `python3 -m pip`/`python -m pip`.
- Configure
- Run `scrape-and-ntfy`
  - This assumes `pip`-installed scripts are in your `PATH`
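If you are unsure whether `pip`-installed scripts are on your `PATH`, a quick check along these lines can help. It is a sketch under assumptions: the helper name `launch_command` is hypothetical, and falling back to `python -m scrape_and_ntfy` assumes the installed package can be run as a module, as the PDM instructions suggest.

```python
import shutil
import sys


def launch_command(
    script: str = "scrape-and-ntfy", module: str = "scrape_and_ntfy"
) -> list[str]:
    """Return a command that starts the tool, preferring the script on PATH."""
    found = shutil.which(script)
    if found is not None:
        return [found]
    # Assumed fallback: run the package as a module with this interpreter.
    return [sys.executable, "-m", module]


if __name__ == "__main__":
    print("Suggested command:", " ".join(launch_command()))
```

The returned list can be passed directly to `subprocess.run` if you want to start the tool from another script.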
PDM
Specific prerequisites
- Python (3.11+)
- PDM
Installation and usage
- Clone the repository
  - `git clone https://github.com/slashtechno/scrape-and-ntfy`
- Change directory into the repository
  - `cd scrape-and-ntfy`
- Run `pdm install`
  - This will install the dependencies in a virtual environment
  - You may need to specify an interpreter with `pdm use`
- Configure
- `pdm run python -m scrape_and_ntfy`
  - This will run the bot with the configuration in `config.toml`