A package that allows users to capture full-page screenshots of websites using Selenium and Chrome webdriver.
Pywebcapture
Loops through a list of URIs and captures a screenshot of each one, which can be saved to disk.
Tested with Python version 3.8.3
Installation
- Install Python version 3.8.3
- Install Git
- Install virtualenv
pip install virtualenv
- Change to desired directory
cd my/workspace/directory/here
- Clone the repo
git clone https://github.com/wirelessfuture/pywebcapture.git
- Change to newly cloned folder
cd pywebcapture
- Create a virtual environment
virtualenv venv
- Activate virtual environment
Windows:
.\venv\Scripts\activate
Linux:
source ./venv/bin/activate
- Install requirements
pip install -r requirements.txt
Basic Usage
Import the modules:
from driver import Driver as d
from loader import CSVLoader
Use the CSVLoader to load your CSV file containing the URIs and optional file names:
Options:
- input_filepath - The absolute path to your CSV file (str)
- has_header - Whether your CSV has a header row or not (bool)
- uri_column - The column that contains the URIs; accepts either the column name (str) or the index position (int)
- filename_column - The column that contains the desired file names (str); can be set to None, in which case the driver uses the URI netloc as the filename
csv_file = CSVLoader("example.csv", True, 3, None)
Call the get_uri_dict() method on the CSVLoader instance to parse the CSV into a Python dictionary:
uri_dict = csv_file.get_uri_dict()
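To make the loader options concrete, here is a minimal sketch of the kind of parsing they describe, using only the standard library. It is not the package's implementation: load_uri_dict is a hypothetical stand-in, the {uri: filename} dictionary shape is an assumption, and integer column indexes are assumed to be zero-based.

```python
import csv
import io
from urllib.parse import urlparse


def load_uri_dict(csv_text, has_header, uri_column, filename_column=None):
    """Hypothetical stand-in for CSVLoader.get_uri_dict(); returns {uri: filename}."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header = rows.pop(0) if has_header and rows else None

    def resolve(column):
        # A column may be given as an index position (int) or a header name (str).
        if isinstance(column, int):
            return column
        if header is None:
            raise ValueError("column names require a header row")
        return header.index(column)

    uri_idx = resolve(uri_column)
    name_idx = None if filename_column is None else resolve(filename_column)

    uri_dict = {}
    for row in rows:
        if len(row) <= uri_idx or not row[uri_idx].strip():
            continue  # skip rows that have no URI
        uri = row[uri_idx].strip()
        if name_idx is not None and len(row) > name_idx and row[name_idx].strip():
            filename = row[name_idx].strip()
        else:
            # No filename given: fall back to the URI's network location.
            filename = urlparse(uri).netloc
        uri_dict[uri] = filename
    return uri_dict


# Sample data (made up for illustration); the URI sits in column index 3.
sample = (
    "id,name,notes,url\n"
    "1,home,,https://example.com/\n"
    "2,docs,,https://docs.python.org/3/\n"
)
uri_dict = load_uri_dict(sample, True, 3, None)
print(uri_dict)
```

With filename_column left as None, each entry is named after its netloc (example.com, docs.python.org). Passing "name" instead would use the values from that column.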
Create an instance of the web driver:
Options:
- output_path - The directory where screenshots are saved (str); setting it to None writes all files to ./screenshots
- delay - The delay in seconds between page requests; the minimum is 2 seconds. Please crawl pages respectfully :)
- uri_dict - The Python dictionary containing your file names and URIs
driver = d(None, 3, uri_dict)
Run the driver. This loops through all URIs, gets the maximum scroll height of each page, and then takes a screenshot:
driver.run()
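The run step can be pictured roughly as follows. This is a sketch under stated assumptions, not the package's actual code: capture_all is a hypothetical function, the {uri: filename} dictionary shape and the .png extension are assumptions, and the full-page effect relies on resizing the window to the page height, which generally requires headless Chrome. The driver methods used (get, execute_script, get_window_size, set_window_size, save_screenshot) are standard Selenium WebDriver calls.

```python
import os
import time

MIN_DELAY = 2  # seconds; the documented minimum between page requests


def capture_all(driver, uri_dict, output_path=None, delay=MIN_DELAY, sleep=time.sleep):
    """Hypothetical sketch: save one full-page screenshot per URI in {uri: filename}."""
    output_path = output_path or "./screenshots"  # documented default location
    os.makedirs(output_path, exist_ok=True)
    delay = max(delay, MIN_DELAY)  # never go below the minimum delay
    saved = []
    for uri, filename in uri_dict.items():
        driver.get(uri)
        # Ask the page how tall it is, then resize the window to show all of it.
        height = driver.execute_script("return document.body.scrollHeight")
        width = driver.get_window_size()["width"]
        driver.set_window_size(width, height)
        target = os.path.join(output_path, filename + ".png")
        driver.save_screenshot(target)
        saved.append(target)
        sleep(delay)  # pause between requests to crawl respectfully
    return saved


# Possible usage with a real browser (requires selenium and Chrome):
#   from selenium import webdriver
#   options = webdriver.ChromeOptions()
#   options.add_argument("--headless")
#   browser = webdriver.Chrome(options=options)
#   capture_all(browser, {"https://example.com/": "example.com"}, None, 3)
#   browser.quit()
```

Taking the driver as a parameter keeps the loop independent of how the browser is created, which also makes it easy to exercise with a stub object.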