
A wrapper around requests, BeautifulSoup, and Selenium (Chrome) to facilitate web scraping.

Installation

pip install webmage

Usage

webmage provides a class called WebSpell. It takes one required argument, url, and two optional arguments, driverPath and encoding. If driverPath is left as None, WebSpell uses ChromeDriverManager to fetch the latest chromedriver for your installed version of Chrome.

from webmage import WebSpell

spell = WebSpell(url='https://javascriptorian.com', driverPath=None, encoding='utf-8')
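The driverPath fallback described above can be pictured with a minimal sketch. This is not webmage's actual code: resolve_driver_path and its installer parameter are hypothetical names used only for illustration. The one real call is ChromeDriverManager().install() from the webdriver_manager package, which downloads a chromedriver and returns its local path.

```python
from typing import Callable, Optional


def default_installer() -> str:
    """Download a matching chromedriver and return its local path."""
    # Imported lazily so this sketch loads even if webdriver_manager
    # is not installed; it is only needed when no path is supplied.
    from webdriver_manager.chrome import ChromeDriverManager

    return ChromeDriverManager().install()


def resolve_driver_path(
    driver_path: Optional[str] = None,
    installer: Callable[[], str] = default_installer,
) -> str:
    """Hypothetical helper: prefer an explicit path, else fall back to the installer."""
    if driver_path is not None:
        return driver_path
    return installer()
```

Passing an explicit path skips the download entirely, which can be useful on machines without internet access or with a pinned chromedriver version.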

spell.get() - Get a static webpage using requests

Call .get() to fetch the webpage using requests. This automatically adds the page's soup object to the spell, where it is available as spell.soup.

spell.get()
print(spell.soup)
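For readers curious what a static fetch like this involves, here is a rough sketch using requests and BeautifulSoup directly. It is an assumption about the general pattern, not webmage's implementation; fetch_soup and decode_body are hypothetical helper names, and the timeout value is arbitrary.

```python
def decode_body(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode response bytes, replacing any bytes invalid in the given encoding."""
    return raw.decode(encoding, errors="replace")


def fetch_soup(url: str, encoding: str = "utf-8", timeout: float = 10.0):
    """Hypothetical helper: fetch a static page and return it as a BeautifulSoup object."""
    # Imported lazily so the pure helper above works without these packages.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    html = decode_body(response.content, encoding)
    return BeautifulSoup(html, "html.parser")
```

Because this path never runs JavaScript, content that a page renders client-side will not appear in the resulting soup.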

spell.drive() - Get a dynamic webpage using Selenium

Call .drive() to load the webpage using Selenium. This opens a Chrome browser and adds the page's soup object to the spell; unlike .get(), the soup changes whenever there is any interaction on the webpage. .drive() takes two optional arguments: nextURL and ghost. nextURL lets you switch to a different URL after a WebSpell has been initialized. ghost runs the browser headless, so no visible Chrome window is opened.

spell = WebSpell('https://javascriptorian.com')
spell.drive()                      # open Chrome and load the initial URL
spell.drive('https://google.com')  # switch to a different URL (nextURL)
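The headless behavior that ghost describes corresponds to a Chrome command-line flag in plain Selenium. The sketch below shows that general idea under stated assumptions: chrome_arguments and load_page_source are hypothetical names, the code targets Selenium 4, and "--headless=new" is the flag used by recent Chrome releases. How webmage wires this up internally is not documented here.

```python
from typing import List


def chrome_arguments(ghost: bool = False) -> List[str]:
    """Return Chrome command-line flags; headless when ghost is True."""
    return ["--headless=new"] if ghost else []


def load_page_source(url: str, ghost: bool = False) -> str:
    """Hypothetical helper: open url in Chrome and return the rendered HTML."""
    # Imported lazily so chrome_arguments() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in chrome_arguments(ghost):
        options.add_argument(argument)

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # page_source reflects the DOM at the moment it is read, so it can
        # differ after clicks, scrolling, or other page interaction.
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on errors
```

The returned HTML can then be handed to BeautifulSoup in the same way as a static page.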
