Python web crawler / scraper for WG-Gesucht. It crawls the WG-Gesucht site for new apartment listings and sends a message to each poster, based on your saved filters and saved text template.

Installation

$ pip install wg-gesucht-crawler-cli

Or, if you have virtualenvwrapper installed:

$ mkvirtualenv wg-gesucht-crawler-cli
$ pip install wg-gesucht-crawler-cli

Use

The crawler can be run directly from the command line with:

$ wg-gesucht-crawler-cli --help

Or if you want to use it in your own project:

from wg_gesucht.crawler import WgGesuchtCrawler

Before running it, make sure you have saved at least one search filter and a template text in your wg-gesucht account.

Features

  • Searches https://wg-gesucht.de for new WG ads based on your saved filters

  • Sends your saved template message to apply to all matching listings

  • Reruns every ~5 minutes

  • Can run 24/7 on a Raspberry Pi or a free EC2 micro instance, so you are always among the first to apply for new listings
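
As a rough illustration of the rerun behaviour listed above, the loop below calls a task and then waits a fixed interval before calling it again. This is only a sketch and not the package's actual scheduler: run_periodically, its parameters, and the 300-second default (standing in for "~5 minutes") are assumptions made for the example.

```python
import time


def run_periodically(task, interval_seconds=300, max_runs=None, sleep=time.sleep):
    """Call task() repeatedly, pausing interval_seconds between calls.

    Hypothetical helper, not part of wg-gesucht-crawler-cli.
    max_runs=None means "run forever"; a number stops after that many calls.
    The sleep function is injectable so the loop can be tested without waiting.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        task()
        runs += 1
        # Only pause if another run is going to follow.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs


# Example: run a placeholder task three times, five minutes apart.
# run_periodically(lambda: print("checking for new listings"), max_runs=3)
```

Passing max_runs keeps the loop bounded, which is convenient when trying it out before leaving it running on a server.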

Getting Caught with reCAPTCHA

I’ve made the crawler sleep for 5-8 seconds between requests to try to avoid their reCAPTCHA. If the crawler does get caught, sign into your wg-gesucht account manually through the browser, solve the reCAPTCHA, and then start the crawler again. If it keeps happening, you can also increase the sleep time in the get_page() function in wg_gesucht.py.
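
To illustrate the idea of pausing between requests, here is a minimal sketch of a randomized delay. It is not the code from get_page(); the helper name polite_delay and its parameters are hypothetical, and the 5 to 8 second range is taken from the description above.

```python
import random
import time


def polite_delay(min_seconds=5.0, max_seconds=8.0, sleep=time.sleep):
    """Sleep for a random duration between min_seconds and max_seconds.

    Hypothetical helper, not part of wg-gesucht-crawler-cli.
    Returns the chosen delay so callers can log it.
    """
    if min_seconds > max_seconds:
        raise ValueError("min_seconds must not exceed max_seconds")
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay


# Example: call this before each request. Raising both bounds
# (for instance to 10 and 15) slows the crawler down further.
# polite_delay()
```

A random interval is used here instead of a fixed one so that requests are not spaced at perfectly regular intervals.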

History

Pre-release

Download files

Source Distribution

wg-gesucht-crawler-cli-0.1.7.tar.gz (33.6 kB)

Built Distribution

wg_gesucht_crawler_cli-0.1.7-py2.py3-none-any.whl (12.0 kB)

File details

Details for the file wg-gesucht-crawler-cli-0.1.7.tar.gz.

File metadata

  • Download URL: wg-gesucht-crawler-cli-0.1.7.tar.gz
  • Upload date:
  • Size: 33.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.6.7

File hashes

Hashes for wg-gesucht-crawler-cli-0.1.7.tar.gz:

Algorithm     Hash digest
SHA256        924dbb4f5f188d8300e29a11c84435a67eb7f02b6568a33ac8fc06b7d174a9e3
MD5           0c0d1768cf9d833243bd40c1e8a297d5
BLAKE2b-256   8b391db2416f849899374e98319f0ee83d47d43007d4ccdc1418521cf1811872
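
If you download the source distribution manually, you can compare its SHA256 digest against the value listed above. The following is a minimal sketch using Python's standard hashlib module; the helper names and the local file path in the usage comment are assumptions for the example.

```python
import hashlib

# SHA256 digest published for wg-gesucht-crawler-cli-0.1.7.tar.gz (see the table above).
EXPECTED_SHA256 = "924dbb4f5f188d8300e29a11c84435a67eb7f02b6568a33ac8fc06b7d174a9e3"


def sha256_of_stream(stream, chunk_size=65536):
    """Return the hex SHA256 digest of a binary file-like object."""
    digest = hashlib.sha256()
    # Read in chunks so large files do not have to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def file_matches(path, expected_hex=EXPECTED_SHA256):
    """Return True if the file at path has the expected SHA256 digest."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle) == expected_hex.strip().lower()


# Example (assumes the archive was saved to the current directory):
# print(file_matches("wg-gesucht-crawler-cli-0.1.7.tar.gz"))
```

pip can also enforce hashes itself through its hash-checking mode, which may be preferable to a manual check in automated setups.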

File details

Details for the file wg_gesucht_crawler_cli-0.1.7-py2.py3-none-any.whl.

File metadata

  • Download URL: wg_gesucht_crawler_cli-0.1.7-py2.py3-none-any.whl
  • Upload date:
  • Size: 12.0 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.6.7

File hashes

Hashes for wg_gesucht_crawler_cli-0.1.7-py2.py3-none-any.whl:

Algorithm     Hash digest
SHA256        f657cc9ebee4b45440c9b63fb827f9d4adbb261d5e3fe4770307b4655e4e4b9f
MD5           4698972da5d9ddee7fc1c31ed4ef7a12
BLAKE2b-256   144951c2de892356617cfe05ec52cedd805ef5fb8ef817249073adb67cf0acbb