Python web crawler / scraper for WG-Gesucht. It crawls the WG-Gesucht site for new apartment listings and sends a message to each poster, based on your saved search filters and saved template text.

Installation

$ pip install wg-gesucht-crawler-cli

Or, if you have virtualenvwrapper installed:

$ mkvirtualenv wg-gesucht-crawler-cli
$ pip install wg-gesucht-crawler-cli

Use

The crawler can be run directly from the command line:

$ wg-gesucht-crawler-cli --help

Or, to use it in your own project, import the crawler class:

from wg_gesucht.crawler import WgGesuchtCrawler

Before running it, make sure you have saved at least one search filter and one template text in your wg-gesucht account.

Features

  • Searches https://wg-gesucht.de for new WG ads based on your saved filters

  • Applies to all matching listings by sending your saved template message

  • Reruns every ~5 minutes

  • Can run 24/7 on a Raspberry Pi or a free EC2 micro instance, so you are always among the first to apply for new listings
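
The "reruns every ~5 minutes" behaviour can be pictured as a simple polling loop. The sketch below is illustrative only and is not taken from the package: `run_periodically`, its parameters, and the `task` callback are hypothetical names, and the crawler's real scheduling may differ.

```python
import time
from typing import Callable


def run_periodically(
    task: Callable[[], None],
    interval_seconds: float,
    max_runs: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `task` up to `max_runs` times, pausing between runs.

    Returns the number of runs that finished without raising.
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    completed = 0
    for run in range(max_runs):
        try:
            task()
            completed += 1
        except Exception as exc:
            # Keep the loop alive if a single pass fails (e.g. a network error).
            print(f"run {run + 1} failed: {exc}")
        if run < max_runs - 1:
            # No pause after the final run.
            sleep(interval_seconds)
    return completed
```

With `interval_seconds=300`, this approximates a five-minute cycle; a long-running deployment would likely use a very large `max_runs` or an external scheduler instead.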

Getting Caught with reCAPTCHA

I’ve made the crawler sleep for 5-8 seconds between requests to try to avoid the site's reCAPTCHA. If the crawler does get caught, sign in to your wg-gesucht account manually in a browser, solve the reCAPTCHA, and then start the crawler again. If it keeps happening, you can increase the sleep time in the get_page() function in wg_gesucht.py.
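
As a rough illustration of the randomized pause described above, here is a minimal sketch. It is not the project's get_page() implementation: `polite_delay` and its parameters are hypothetical, and only the 5-8 second range comes from the text.

```python
import random
import time
from typing import Callable


def polite_delay(
    min_seconds: float = 5.0,
    max_seconds: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Sleep for a random duration between the two bounds and return it.

    A varying pause makes the request pattern less regular than a fixed one.
    `sleep` is injectable so the function can be tested without waiting.
    """
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay
```

Calling a helper like this before each page request, and raising both bounds if the reCAPTCHA keeps appearing, mirrors the advice to increase the sleep time.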

History

Pre-release

Source Distribution

wg-gesucht-crawler-cli-0.2.1.tar.gz (33.9 kB)

File details

Details for the file wg-gesucht-crawler-cli-0.2.1.tar.gz.

File metadata

  • Download URL: wg-gesucht-crawler-cli-0.2.1.tar.gz
  • Upload date:
  • Size: 33.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.6.7

File hashes

Hashes for wg-gesucht-crawler-cli-0.2.1.tar.gz:

  • SHA256: 2a3be453b1ff8ae9660808c1ec28768ac3d5d66cacf490445f30cefcfc6dbab6
  • MD5: c35f3d3d72d613cf3bb81c1f1fe71f3f
  • BLAKE2b-256: 3ac7a5177f8c5cb5fa73e76a95d1709baf2738fdfbe53926496ffe80fcfa60d2
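
One way to check a downloaded archive against the SHA256 digest listed above is to hash the file locally with Python's standard `hashlib` module. The helper names below (`sha256_of_file`, `matches_expected`) are hypothetical and used only for illustration.

```python
import hashlib

# SHA256 digest published for wg-gesucht-crawler-cli-0.2.1.tar.gz (from the list above).
EXPECTED_SHA256 = "2a3be453b1ff8ae9660808c1ec28768ac3d5d66cacf490445f30cefcfc6dbab6"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with an expected hex string, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_expected("wg-gesucht-crawler-cli-0.2.1.tar.gz", EXPECTED_SHA256)` should return `True` for an intact download, assuming the file sits in the current directory.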
