
Scrape and Ntfy

An extremely customizable web scraper with a modular notification system and persistent storage via SQLite.

Features

  • Modular notification system
    • Currently supports webhooks (e.g. Discord, Slack) and ntfy.sh
  • Web scraping via Selenium
  • Simple configuration of multiple scrapers with conditional notifications

Usage

Prerequisites

  • A browser
    • Preferably a Firefox-based browser, though some Chromium-based browsers also work
    • Edge is not recommended

Basic Configuration

  • Configuration for the web scraper is handled through a TOML file
    • For an example configuration, see config.example.toml
    • This can be copied to config.toml and edited to suit your needs (a hypothetical sketch follows this list)
    • To get the CSS selector for an element, you can use your browser's developer tools (F12, Ctrl+Shift+I, right-click -> Inspect Element, etc.)
      1. If you're not already in inspect element mode, press Ctrl+Shift+C to enter it (or click the inspect button in the developer tools)
      2. Click on the element you want to select
      3. Right-click on the element in the HTML pane
      4. Click "Copy" -> "Copy selector"
  • Some other configuration is handled through environment variables and/or command-line arguments (--help for more information)
    • For example, to set the path to the configuration file, you can set the PATH_TO_TOML environment variable or use the --path-to-toml command-line argument
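
To make the shape of the configuration concrete, here is a minimal sketch. It is illustrative only: the authoritative key names are in config.example.toml, and every table and key below (scrapers, notifiers, url, css_selector, interval, type, topic) is an assumption rather than the project's confirmed schema.

    # Hypothetical sketch; consult config.example.toml for the real schema
    [[scrapers]]
    url = "https://example.com/product"   # page to load with Selenium
    css_selector = "#price > span"        # selector copied via "Copy selector"
    interval = 300                        # assumed: seconds between scrapes

    [[notifiers]]
    type = "ntfy"                         # webhooks would be configured similarly
    topic = "my-topic"                    # assumed: ntfy.sh topic to publish to

For the environment-variable route, the variable can be set inline when launching the scraper, e.g. PATH_TO_TOML=/path/to/config.toml python -m scrape_and_ntfy.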

Docker (Recommended)

Specific prerequisites

  • Docker
    • Docker is a platform for developing, shipping, and running applications in containers
  • Docker Compose

Installation and usage

  1. Clone the repository
    • git clone https://github.com/slashtechno/scrape-and-ntfy
  2. Change directory into the repository
    • cd scrape-and-ntfy
  3. Configure via config.toml
    • Optionally, you can configure some other options via environment variables or command-line arguments
  4. docker compose up -d
    • The -d flag runs the containers in the background (see below for inspecting them once detached)
    • If you want, you can run sqlite-web by uncommenting the appropriate lines in docker-compose.yml to view the database in a browser on localhost:5050
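
Because the containers run detached, the usual Docker Compose commands (standard Docker features, not anything project-specific) can be used to inspect or stop them:

  • docker compose logs -f
    • Follows the scraper's output
  • docker compose down
    • Stops and removes the containers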

PDM

Specific prerequisites

  • Python (3.11+)
  • PDM

Installation and usage

  1. Clone the repository
    • git clone https://github.com/slashtechno/scrape-and-ntfy
  2. Change directory into the repository
    • cd scrape-and-ntfy
  3. Configure via config.toml
    • Optionally, you can configure some other options via environment variables or command-line arguments
  4. pdm run python -m scrape_and_ntfy
    • This will run the scraper with the configuration in config.toml (see the example below for using a different path)
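
To use a configuration file at a different path, the --path-to-toml flag mentioned earlier can be passed through PDM, for example:

  • pdm run python -m scrape_and_ntfy --path-to-toml /path/to/config.toml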
