Project description

Scrape and Ntfy

An extremely customizable web scraper with a modular notification system and persistent storage via SQLite.

Features

  • Modular notification system
    • Currently supports Webhooks (e.g. Discord, Slack, etc.) and ntfy.sh
  • Web scraping via Selenium
  • Simple configuration of multiple scrapers with conditional notifications
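The core loop implied by these features — scrape a value, compare it against the last value persisted in SQLite, and notify only on change — can be sketched roughly as follows. This is a minimal illustration under assumed names: the table schema and the `check_and_notify` function are hypothetical, not the project's actual internals.

```python
import sqlite3

# In-memory database for illustration; the real project persists to a file.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE scrapers (url TEXT PRIMARY KEY, last_value TEXT)")

def check_and_notify(url: str, scraped_value: str) -> bool:
    """Store the scraped value; return True if it changed
    (i.e. a notification should fire)."""
    row = db.execute(
        "SELECT last_value FROM scrapers WHERE url = ?", (url,)
    ).fetchone()
    db.execute(
        "INSERT INTO scrapers (url, last_value) VALUES (?, ?) "
        "ON CONFLICT(url) DO UPDATE SET last_value = excluded.last_value",
        (url, scraped_value),
    )
    db.commit()
    # Treat the first sighting as a change; unchanged repeats stay quiet.
    return row is None or row[0] != scraped_value

print(check_and_notify("https://example.com", "$19.99"))  # True (first sighting)
print(check_and_notify("https://example.com", "$19.99"))  # False (unchanged)
print(check_and_notify("https://example.com", "$17.99"))  # True (price changed)
```

In this shape, a "conditional notification" is just the boolean gate on the stored value; the modular notifiers (webhook, ntfy.sh) would be invoked whenever it returns True.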

Usage

Prerequisites

  • A browser
    • Most Chromium- and Firefox-based browsers should work
    • Edge is not recommended
    • If no suitable browser is installed, Selenium should be able to download and cache one automatically

Basic Configuration

  • Configuration for the web scraper is handled through a TOML file
    • An example configuration is provided in config.example.toml
    • Copy it to config.toml and edit it to suit your needs
    • To get the CSS selector for an element, you can use your browser's developer tools (F12, Ctrl+Shift+I, right-click -> Inspect Element, etc.)
      1. If you're not already in element-inspection mode, press Ctrl+Shift+C to enter it (or click the inspect button in the developer tools)
      2. Click on the element you want to select
      3. Right-click on the element in the HTML pane
      4. Click "Copy" -> "Copy selector"
  • Some other configuration is handled through environment variables and/or command-line arguments (--help for more information)
    • For example, to set the path to the configuration file, you can set the PATH_TO_TOML environment variable or use the --path-to-toml command-line argument
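To give a feel for the shape of such a file, here is a rough sketch. The key names below are hypothetical — config.example.toml in the repository is the authoritative reference for the actual schema.

```toml
# Hypothetical sketch; see config.example.toml for the real schema.
[[scrapers]]
url = "https://example.com/product"
css_selector = "#price"
interval = 300  # seconds between scrapes

[[notifiers]]
type = "ntfy"
topic = "my-scraper-alerts"

[[notifiers]]
type = "webhook"
url = "https://discord.com/api/webhooks/..."
```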

Docker (Recommended)

Specific prerequisites

  • Docker
    • Docker is a platform for developing, shipping, and running applications in containers
  • Docker Compose

Installation and usage

  1. Clone the repository
    • git clone https://github.com/slashtechno/scrape-and-ntfy
  2. Change directory into the repository
    • cd scrape-and-ntfy
  3. Configure via config.toml
    • Optionally, you can configure some other options via environment variables or command-line arguments in the docker-compose.yml file
  4. docker compose up -d
    • The -d flag runs the containers in the background
    • Optionally, you can enable sqlite-web by uncommenting the appropriate lines in docker-compose.yml to browse the database at localhost:5050
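For orientation, a compose file for this kind of setup might look roughly like the following. The service names, image, and paths here are illustrative assumptions — the repository's docker-compose.yml is authoritative.

```yaml
services:
  scrape-and-ntfy:
    build: .
    volumes:
      - ./config.toml:/app/config.toml
      - ./data:/app/data          # keep the SQLite database on the host
    # environment:
    #   PATH_TO_TOML: /app/config.toml

  # Uncomment to browse the database at http://localhost:5050
  # sqlite-web:
  #   image: coleifer/sqlite-web
  #   ports:
  #     - "5050:8080"
  #   volumes:
  #     - ./data:/data
```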

pip

Specific prerequisites

  • Python (3.11+)

Installation and usage

  1. Install with pip
    • pip install scrape-and-ntfy
    • Depending on your system, you may need to use pip3 instead of pip, or invoke it as python -m pip / python3 -m pip
  2. Configure
  3. Run scrape-and-ntfy
    • This assumes pip-installed scripts are in your PATH

PDM

Specific prerequisites

  • Python (3.11+)
  • PDM

Installation and usage

  1. Clone the repository
    • git clone https://github.com/slashtechno/scrape-and-ntfy
  2. Change directory into the repository
    • cd scrape-and-ntfy
  3. Run pdm install
    • This will install the dependencies in a virtual environment
    • You may need to specify an interpreter with pdm use
  4. Configure
  5. pdm run python -m scrape_and_ntfy
    • This will run the scraper with the configuration in config.toml

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

scrape_and_ntfy-0.1.2.tar.gz (11.4 kB view details)

Uploaded Source

Built Distribution

scrape_and_ntfy-0.1.2-py3-none-any.whl (13.1 kB view details)

Uploaded Python 3

File details

Details for the file scrape_and_ntfy-0.1.2.tar.gz.

File metadata

  • Download URL: scrape_and_ntfy-0.1.2.tar.gz
  • Upload date:
  • Size: 11.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.15.4 CPython/3.12.3 Darwin/21.6.0

File hashes

Hashes for scrape_and_ntfy-0.1.2.tar.gz
Algorithm Hash digest
SHA256 15578472ca798cf875f5ef99228da8da140fc4ff8cab4e97f41752533d811ae3
MD5 b2abdba1e84ea3e048cb4d9b0450158c
BLAKE2b-256 e2d90b96f202f62516dd529bd76ff45bd2347d1fd8a0456c8c943a9ce2005631

See more details on using hashes here.

File details

Details for the file scrape_and_ntfy-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: scrape_and_ntfy-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 13.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.15.4 CPython/3.12.3 Darwin/21.6.0

File hashes

Hashes for scrape_and_ntfy-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 170959c328ec48d5c0d165e01a4d57886c5d7971a73c6cae6d89c90053932d7d
MD5 c9251141aac330848401386ae8f702b6
BLAKE2b-256 7407cc623fcfbfc983dc965ee0c69ef57aec0ad6ff11a5de9edfcbd1a47d9739

See more details on using hashes here.
