Scrape and Ntfy
An extremely customizable web scraper with a modular notification system and persistent storage via SQLite.
Features
- Modular notification system
  - Currently supports webhooks (e.g. Discord, Slack) and ntfy.sh
- Web scraping via Selenium
- Simple configuration of multiple scrapers with conditional notifications
Usage
Prerequisites
- A browser
  - Most Chromium-based browsers and Firefox-based browsers should work
  - Edge is not recommended
  - Selenium should also be able to download and cache the appropriate browser if necessary
Basic Configuration
- Configuration for the web scraper is handled through a TOML file
  - To see an example configuration, see config.example.toml
    - This can be copied to config.toml and edited to suit your needs
  - To get the CSS selector for an element, you can use your browser's developer tools (F12, Ctrl+Shift+I, right-click -> Inspect Element, etc.)
    - If you're not already in inspect, you can press Ctrl+Shift+C to enter inspect element mode (or just click the inspect button in the developer tools)
    - Click on the element you want to select
    - Right-click on the element in the HTML pane
    - Click "Copy" -> "Copy selector"
- Some other configuration is handled through environment variables and/or command-line arguments (--help for more information)
  - For example, to set the path to the configuration file, you can set the PATH_TO_TOML environment variable or use the --path-to-toml command-line argument
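The interplay between the PATH_TO_TOML environment variable and the --path-to-toml argument can be pictured with the sketch below. It is an illustration only, not the project's own argument handling: the function name `resolve_config_path`, the precedence (argument over environment variable), and the `config.toml` fallback are all assumptions.

```python
import argparse
from pathlib import Path


def resolve_config_path(argv: list[str], environ: dict[str, str]) -> Path:
    """Pick the config path: CLI flag, then environment variable, then a default."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--path-to-toml", default=None)
    # parse_known_args ignores any other flags instead of raising an error.
    args, _unknown = parser.parse_known_args(argv)
    # Assumption: the CLI flag overrides the environment variable,
    # and "config.toml" is used when neither is set.
    chosen = args.path_to_toml or environ.get("PATH_TO_TOML") or "config.toml"
    return Path(chosen)


if __name__ == "__main__":
    # Example inputs; a real program would pass sys.argv[1:] and os.environ.
    example_argv = ["--path-to-toml", "custom.toml"]
    example_env = {"PATH_TO_TOML": "from-env.toml"}
    print(resolve_config_path(example_argv, example_env))
```

On Python 3.11+, the resulting path could then be opened in binary mode and parsed with the standard library's `tomllib.load`.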
Docker (Recommended)
Specific prerequisites
- Docker
  - Docker is a platform for developing, shipping, and running applications in containers
- Docker Compose
Installation and usage
- Clone the repository
  - git clone https://github.com/slashtechno/scrape-and-ntfy
- Change directory into the repository
  - cd scrape-and-ntfy
- Configure via config.toml
  - Optionally, you can configure some other options via environment variables or command-line arguments in the docker-compose.yml file
- Run docker compose up -d
  - The -d flag runs the containers in the background
  - If you want, you can run sqlite-web by uncommenting the appropriate lines in docker-compose.yml to view the database in a browser on localhost:5050
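If you would rather peek at the database from Python than through sqlite-web, something like the following sketch may help. It makes no assumptions about the project's schema: it only lists whatever tables exist. The function name `list_tables` is hypothetical, and the demo uses an in-memory database with made-up table names rather than the project's real database file.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of all tables in a SQLite database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Demo on an in-memory database; the table names here are made up.
    # For a real file, use sqlite3.connect("path/to/your/database") instead.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE example_a (value TEXT)")
    conn.execute("CREATE TABLE example_b (value TEXT)")
    print(list_tables(conn))
    conn.close()
```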
pip
Specific prerequisites
- Python (3.11+)
Installation and usage
- Install with pip
  - pip install scrape-and-ntfy
  - Depending on your system, you may need to use pip3 instead of pip, or python3 -m pip / python -m pip
- Configure
- Run scrape-and-ntfy
  - This assumes pip-installed scripts are in your PATH
PDM
Specific prerequisites
- Python (3.11+)
- PDM
Installation and usage
- Clone the repository
  - git clone https://github.com/slashtechno/scrape-and-ntfy
- Change directory into the repository
  - cd scrape-and-ntfy
- Run pdm install
  - This will install the dependencies in a virtual environment
  - You may need to specify an interpreter with pdm use
- Configure
- Run pdm run python -m scrape_and_ntfy
  - This will run the bot with the configuration in config.toml