A Python module for web scraping with Selenium and BeautifulSoup

Selestium

Selestium is a Python module for web scraping and automation using Selenium WebDriver.

Features

  • Provides a high-level interface for interacting with HTML content in web pages.
  • Supports rendering JavaScript-based web pages using headless browsers (Firefox and Chrome).
  • Allows easy navigation, element identification, and data extraction from web pages.

Installation

You can install Selestium using pip:

pip install selestium
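To confirm that the install succeeded, the installed version can be read back from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is an illustration and is not part of Selestium.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints a version string if the package is installed, otherwise None.
    print(installed_version("selestium"))
```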

Dependencies for Termux

In Termux, some dependencies must be installed manually. This will be automated in a later release.

NOTE: Chrome does not work in Termux. Only Firefox is supported there.

First, update the package index and install the tur and x11 repositories:

pkg update -y; pkg install -y tur-repo x11-repo

Then install Firefox and geckodriver:

pkg install -y firefox geckodriver

You are now ready to go.
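Before the first run, it can help to check that both executables are actually on the PATH. The sketch below uses only the standard library; the helper name `missing_binaries` is hypothetical, and the binary names are taken from the `pkg install` command above.

```python
import shutil


def missing_binaries(names=("firefox", "geckodriver")):
    """Return the names from `names` that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_binaries()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("firefox and geckodriver are available")
```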

Dependencies for Linux

On Linux, you also need to install the libraries that Firefox depends on.

Please note that GNU/Linux distributors may provide packages for your distribution which have different requirements.

Firefox will not run at all without the following libraries or packages:

  • glibc 2.17 or higher
  • GTK+ 3.14 or higher
  • libglib 2.42 or higher
  • libstdc++ 4.8.1 or higher
  • X.Org 1.0 or higher (1.7 or higher is recommended)

For optimal functionality, we recommend the following libraries or packages:

  • DBus 1.0 or higher
  • NetworkManager 0.7 or higher
  • PulseAudio

For Debian-based distros:

sudo apt update -y && sudo apt install -y \
    libc6 \
    libgtk-3-0 \
    libglib2.0-0 \
    libstdc++6 \
    xorg
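One of the listed requirements, the glibc version, can be checked from Python. The following is a rough sketch using the standard library's `platform.libc_ver()`; the helper names are illustrative, and the check returns None on systems where glibc is not detected (for example, musl-based distributions).

```python
import platform
import re


def parse_version(text):
    """Turn a dotted version such as '2.35' into a tuple of ints such as (2, 35)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def glibc_meets_minimum(minimum=(2, 17)):
    """Return True or False on glibc systems, or None if glibc is not detected."""
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        return None
    return parse_version(version) >= minimum


if __name__ == "__main__":
    print(platform.libc_ver(), glibc_meets_minimum())
```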

Usage

Here's a basic example of how to use Selestium to render a web page and extract information:

Make a Request Without Rendering:

from Selestium import HTMLNavigator

# Initialize an HTMLNavigator instance with default settings (Firefox browser)
navigator = HTMLNavigator()

# Make a GET request to a web page without rendering
response = navigator.get("https://www.example.com")

# Extract information from the response
print(response.text)
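Because `response.text` is printed as plain HTML here, it can be post-processed with any HTML parser. The sketch below collects heading text using the standard library's `html.parser`; the `HeadingCollector` class and `extract_headings` function are illustrations, not part of Selestium, and a hard-coded string stands in for `response.text`.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text content of every occurrence of one tag (default: h1)."""

    def __init__(self, tag="h1"):
        super().__init__()
        self.tag = tag
        self._inside = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self._inside = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == self.tag:
            self._inside = False

    def handle_data(self, data):
        # Text is appended only while the parser is inside the target tag.
        if self._inside:
            self.headings[-1] += data


def extract_headings(html, tag="h1"):
    """Return the stripped text of each `tag` element found in `html`."""
    collector = HeadingCollector(tag)
    collector.feed(html)
    collector.close()
    return [heading.strip() for heading in collector.headings]


# Stand-in for response.text from the example above.
sample_html = "<html><body><h1>Example Domain</h1><p>Text</p></body></html>"

if __name__ == "__main__":
    print(extract_headings(sample_html))
```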

Make a Request With Rendering:

from Selestium import HTMLNavigator

# Initialize an HTMLNavigator instance with Firefox browser
navigator = HTMLNavigator(browser='firefox')

# Get a web page and render it using the browser
response = navigator.get("https://www.example.com", render=True)

# Extract information from the rendered page
titles = response.find("h1")
for title in titles:
    print(title.text)

Using the Controller Method:

from Selestium import HTMLNavigator

# Initialize an HTMLNavigator instance with Chrome browser
navigator = HTMLNavigator(browser='chrome')

# Get the browser controller (WebDriver) instance
driver = navigator.browser_controller()

# Navigate to a web page
driver.get("https://www.example.com")

# Perform additional actions using the browser controller
# For example, click a button or fill out a form
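# (recent Selenium 4 releases removed find_element_by_id; use find_element with By)
# from selenium.webdriver.common.by import By
# driver.find_element(By.ID, "button_id").click()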
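Assuming `browser_controller()` returns a standard Selenium WebDriver, as the comment above suggests, a click can be wrapped in a small helper. This is a sketch only: `click_by_id` is a hypothetical name, and it uses the generic `find_element(by, value)` call because the `find_element_by_*` shortcuts are no longer available in recent Selenium 4 releases.

```python
def click_by_id(driver, element_id):
    """Find an element by its id attribute, click it, and return it."""
    # "id" is the locator strategy string behind selenium's By.ID constant,
    # so this call is equivalent to driver.find_element(By.ID, element_id).
    element = driver.find_element("id", element_id)
    element.click()
    return element
```

With the `driver` from the example above, this would be called as `click_by_id(driver, "button_id")`.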

Contributing

Contributions are welcome! If you encounter any issues or have suggestions for improvement, please open an issue or submit a pull request on GitHub.

License

This project is licensed under the MIT License - see the LICENSE file for details.
