
A Telegram bot to stay up to date on real estate ads


Scraper Bot


This bot periodically scrapes ads from commercial websites.

When it finds a new ad, the bot sends it to you through Apprise channels.
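
The idea of reporting only new ads can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: `AdTracker`, the `notify` callable, and the example URLs are hypothetical, and in the real bot delivery goes through Apprise rather than a plain callable.

```python
from typing import Callable, Iterable


class AdTracker:
    """Remember which ads were already seen and report only new ones.

    Hypothetical helper for illustration; not part of scraper-bot.
    """

    def __init__(self, notify: Callable[[str], None]) -> None:
        self._seen: set[str] = set()
        self._notify = notify  # stand-in for an Apprise notification

    def process(self, ad_urls: Iterable[str]) -> list[str]:
        """Notify once for each ad URL not seen before; return the new ones."""
        new_ads = []
        for url in ad_urls:
            if url not in self._seen:
                self._seen.add(url)
                self._notify(url)
                new_ads.append(url)
        return new_ads


# Two simulated scraping rounds: the second one repeats an ad from the first.
sent: list[str] = []
tracker = AdTracker(notify=sent.append)
first_run = tracker.process(["https://example.com/ad/1", "https://example.com/ad/2"])
second_run = tracker.process(["https://example.com/ad/2", "https://example.com/ad/3"])
```

In this sketch the second round only reports the ad that was not present in the first one, which is the behavior a periodic scraper needs to avoid duplicate notifications.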

Deploy

PyPI

The package is available on PyPI

pip install scraper-bot

The package relies heavily on the playwright package, so before you start using the bot you have to install a Playwright browser

playwright install --with-deps firefox

You can find further information in the Playwright documentation (n.b. the bot is not limited to Firefox)

The scraper-bot package provides the following command to run the bot

scraper-bot

Container

The CI builds a container image for each version and publishes it to the public GitHub registry

ghcr.io/robertobochet/scraper-bot

docker compose

  1. Create a Telegram bot and retrieve its token
  2. Download config.example.yaml and rename it to config.yaml
  3. Edit the configuration following the guidelines
  4. Download docker-compose.yaml
  5. Start the scraper with docker-compose
    docker-compose up
    
  6. Wait for the bot to do its work!

Kubernetes (Helm chart)

A Helm chart is also available for deploying Scraper Bot

You can find the source code in the scraper-bot-chart repo

The Helm chart package is available in the GitHub OCI registry

oci://ghcr.io/robertobochet/scraper-bot-chart

You can use it to deploy directly to your Kubernetes cluster

  1. Retrieve the default values file
    helm show values oci://ghcr.io/robertobochet/scraper-bot-chart > values.yaml
    
  2. Customize the values.yaml
  3. Install the scraper bot
    helm install scraper-bot oci://ghcr.io/robertobochet/scraper-bot-chart -f values.yaml
    

Configuration

By default the bot looks for a configuration file at ./config.y(a)ml and /etc/scraper-bot/config.y(a)ml. You can override this behavior by passing the --config argument, followed by the config file path, on the command line

scraper-bot --config /path/to/scraper-bot-config.yaml
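
The lookup described above can be pictured with a short sketch. It is an assumption-laden illustration, not the bot's own code: `find_config` and `CONFIG_NAMES` are hypothetical names, and the order in which the `.yml` and `.yaml` variants are tried is a guess.

```python
from pathlib import Path
from typing import Iterable, Optional

# Candidate file names matching the config.y(a)ml pattern.
# The order between the two variants is an assumption.
CONFIG_NAMES = ("config.yml", "config.yaml")


def find_config(search_dirs: Iterable[Path], override: Optional[Path] = None) -> Optional[Path]:
    """Return the config file to use, or None if none is found.

    An explicit override (the --config value) wins; otherwise the first
    existing candidate in the search directories is returned.
    """
    if override is not None:
        return override
    for directory in search_dirs:
        for name in CONFIG_NAMES:
            candidate = Path(directory) / name
            if candidate.is_file():
                return candidate
    return None
```

To mirror the defaults, you would pass the current directory and the system-wide directory as `search_dirs`, in that order.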

The configuration file has to satisfy the pydantic model defined in scraper_bot.settings. You can also print the config JSON schema from the command line with the --config-schema argument

scraper-bot --config-schema
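
If you want to inspect the schema from a script, something like the following sketch may help. It assumes that the command prints only the JSON schema to standard output; `fetch_config_schema` and `required_keys` are hypothetical helpers, not part of the package.

```python
import json
import subprocess
from typing import Sequence


def fetch_config_schema(command: Sequence[str] = ("scraper-bot", "--config-schema")) -> dict:
    """Run the command and parse its standard output as JSON.

    Assumes the command prints only the JSON schema to stdout.
    """
    result = subprocess.run(list(command), capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def required_keys(schema: dict) -> list[str]:
    """List the top-level properties the schema marks as required."""
    return list(schema.get("required", []))
```

Calling `required_keys(fetch_config_schema())` on a machine where scraper-bot is installed would list the top-level settings that must appear in config.yaml, provided the schema declares them under the standard JSON Schema `required` keyword.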

You can also find a configuration example in config.example.yaml.
