Boilerplate for developing crawlers with Selenium.

Project description

Selenium Crawler Template

Installation

pip install selenium-crawler-template

Usage

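from selenium_crawler_template import Crawler


class MyCrawler(Crawler):
    # The decorator opens the URL passed as the argument in a new tab
    # before the method body runs, so the argument itself is unused here.
    @Crawler.open_url_in_new_tab
    def _get_email_from_profile(self, _):
        return self.find_element('a#email').get_attribute('href')

    def crawl(self, **kwargs):
        # Load the listing page given by the caller.
        self.driver.get(kwargs['url'])

        # Visit each profile link and read the email link from its page.
        for profile in self.find_elements('ul > .profile'):
            email = self._get_email_from_profile(profile.get_attribute('href'))
            # Store or process `email` here.

        self._scroll_to_bottom()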
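
To make the role of the decorator in the usage example easier to picture, here is a minimal sketch of how an "open URL in new tab" decorator could be written. It is not the package's implementation: the function below is a hypothetical stand-in, and it assumes a Selenium 4 style driver stored on self.driver (switch_to.new_window, switch_to.window, close, get, current_window_handle).

```python
import functools


def open_url_in_new_tab(method):
    """Hypothetical decorator: run `method` in a new tab loaded with `url`.

    Assumes `self.driver` behaves like a Selenium 4 WebDriver.
    """
    @functools.wraps(method)
    def wrapper(self, url, *args, **kwargs):
        driver = self.driver
        original = driver.current_window_handle
        # Open a blank tab and move focus to it.
        driver.switch_to.new_window("tab")
        try:
            driver.get(url)
            return method(self, url, *args, **kwargs)
        finally:
            # Always close the extra tab and return to the original one,
            # even if the wrapped method raises.
            driver.close()
            driver.switch_to.window(original)
    return wrapper
```

With a wrapper of this shape, the decorated method only has to describe what to read from the page; opening the tab, closing it, and restoring focus happen around it.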

Download files

Source Distribution

selenium-crawler-template-0.4.0.tar.gz (3.7 kB)

Uploaded Source

Built Distribution

selenium_crawler_template-0.4.0-py3-none-any.whl (3.7 kB)

Uploaded Python 3

File details

Details for the file selenium-crawler-template-0.4.0.tar.gz.

File metadata

  • Download URL: selenium-crawler-template-0.4.0.tar.gz
  • Upload date:
  • Size: 3.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.8.5

File hashes

Hashes for selenium-crawler-template-0.4.0.tar.gz
Algorithm    Hash digest
SHA256       3a376a824e3d91f20eecb9910d04b642dca856894e8fb80582f3e77e6949dc66
MD5          5bc55fdf7f6b82b2119ac05a6afde0cf
BLAKE2b-256  b7153a0f3923184a814aadac12d14c9db2ebff0de5dab615efe6d362b165dbb1
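
A downloaded file can be checked against the SHA256 digest listed above using only the Python standard library. The sketch below is one way to do that; the helper names sha256_of and matches are illustrative, and the file is assumed to be in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA256 digest equals `expected_hex`."""
    return sha256_of(path) == expected_hex.strip().lower()


# Digest published for selenium-crawler-template-0.4.0.tar.gz.
EXPECTED = "3a376a824e3d91f20eecb9910d04b642dca856894e8fb80582f3e77e6949dc66"
# Example call, once the file has been downloaded:
# matches("selenium-crawler-template-0.4.0.tar.gz", EXPECTED)
```

pip can perform the same check at install time when a requirements file pins the digest with --hash and pip is run with --require-hashes.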

File details

Details for the file selenium_crawler_template-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: selenium_crawler_template-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 3.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.8.5

File hashes

Hashes for selenium_crawler_template-0.4.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       65df5f2027ca91e1d7208ad84bf43e81e3e1ee2b177b9cd72d89248b6b76034d
MD5          bc449b7480505f98185052de09036558
BLAKE2b-256  f1a3e0f1d965d85449e8d080a491eb6a3666c7d7980068061e1979e7806cb05b
