PyWebScraper

A web scraper that combines both Beautiful Soup (bs4) and Selenium.

Installation

pip install PyWebScraper

Usage

from PyWebScraper import Scraper

Scraper() uses either bs4 or selenium (selected with the input 'scraper_type') to load a website (given by the input 'url') and stores the result in Scraper().page.
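A minimal sketch of the call described above. The helper fetch_page and its default of 'bs4' are hypothetical and not part of PyWebScraper. The sketch also assumes that 'url' and 'scraper_type' are accepted as keyword arguments. The type of the object stored in Scraper().page is not documented here, so it is returned as-is.

```python
VALID_SCRAPER_TYPES = ("bs4", "selenium")  # the two values named in the description


def fetch_page(url, scraper_type="bs4"):
    """Hypothetical helper: load url and return whatever Scraper stores in .page."""
    if scraper_type not in VALID_SCRAPER_TYPES:
        raise ValueError(
            f"scraper_type must be one of {VALID_SCRAPER_TYPES}, got {scraper_type!r}"
        )

    # Imported lazily so the check above runs even where PyWebScraper is not installed.
    from PyWebScraper import Scraper

    # Assumption: both inputs can be passed as keyword arguments.
    scraper = Scraper(url=url, scraper_type=scraper_type)
    return scraper.page


# Example call (needs network access and PyWebScraper installed):
# page = fetch_page("https://example.com", scraper_type="bs4")
```

Validating 'scraper_type' before the import keeps a typo from starting a browser session by accident.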

Optional inputs for Scraper():

url = str (the address to open; the loaded page is saved in Scraper().page)
scraper_type = 'bs4' or 'selenium'
scroll_down = boolean (with selenium, scrolls down the page before saving it)
user_agent = 'desktop' or 'mobile'
auto_close_selenium = boolean (if False, the selenium browser is not closed automatically, so you can keep interacting with it via Scraper().selenium)
selenium_remote_webdriver = str (IP address of a remote webdriver for selenium, see https://www.selenium.dev/docs/site/en/remote_webdriver/remote_webdriver_client/)
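The optional inputs above can be collected and checked before they are handed to Scraper(). The following is a sketch only: build_scraper_kwargs and open_browser are hypothetical helpers, not part of PyWebScraper, and the sketch assumes every input is accepted as a keyword argument. Options that are not passed are left out, so the library's own defaults (not listed here) still apply.

```python
ALLOWED_OPTIONS = {
    "url",
    "scraper_type",
    "scroll_down",
    "user_agent",
    "auto_close_selenium",
    "selenium_remote_webdriver",
}
ALLOWED_VALUES = {
    "scraper_type": ("bs4", "selenium"),
    "user_agent": ("desktop", "mobile"),
}
BOOLEAN_OPTIONS = ("scroll_down", "auto_close_selenium")


def build_scraper_kwargs(**options):
    """Hypothetical helper: validate the documented inputs and return them as a dict."""
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise TypeError(f"unknown option(s): {sorted(unknown)}")
    for name, allowed in ALLOWED_VALUES.items():
        if name in options and options[name] not in allowed:
            raise ValueError(f"{name} must be one of {allowed}, got {options[name]!r}")
    for name in BOOLEAN_OPTIONS:
        if name in options and not isinstance(options[name], bool):
            raise TypeError(f"{name} must be a boolean")
    return dict(options)


def open_browser(url):
    """Hypothetical helper: load url with selenium and leave the browser open."""
    # Imported lazily so the validation above works without PyWebScraper installed.
    from PyWebScraper import Scraper

    kwargs = build_scraper_kwargs(
        url=url,
        scraper_type="selenium",
        scroll_down=True,
        auto_close_selenium=False,
    )
    # Assumption: with auto_close_selenium=False the browser object remains
    # reachable as scraper.selenium, as the option description suggests.
    return Scraper(**kwargs)
```

With auto_close_selenium set to False, the caller is presumably responsible for closing the browser once it is no longer needed.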

Download files

Source Distribution: PyWebScraper-0.1.5.tar.gz (4.9 kB)
Built Distribution: PyWebScraper-0.1.5-py3-none-any.whl (6.2 kB, Python 3)
