# PowerScrape
PowerScrape is a Python module for web scraping. It extracts several kinds of content from the web, including HTML, JSON, images, and text from PDF files. It also includes features for making different kinds of web requests, handling errors, and cloning entire websites into local files.
## Features
- Scraping HTML content from web pages
- Scraping JSON data from APIs
- Downloading images from a web page
- Rendering JavaScript-based web pages
- Extracting text from PDF files
- Extracting data points from chart images
- Making HTTP POST requests
- Handling various HTTP status codes
- Cloning entire websites into local files
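
As a rough illustration of what handling HTTP status codes can involve in a scraper, the sketch below maps a status code to a coarse action using only the standard library. The `classify_status` helper and the `RETRYABLE` set are hypothetical names chosen for this example. They are not part of PowerScrape's API and do not describe how PowerScrape behaves internally.

```python
from http import HTTPStatus

# Hypothetical helper for illustration only; not part of PowerScrape.
# Statuses that usually indicate a temporary problem worth retrying.
RETRYABLE = {
    HTTPStatus.TOO_MANY_REQUESTS,      # 429
    HTTPStatus.INTERNAL_SERVER_ERROR,  # 500
    HTTPStatus.BAD_GATEWAY,            # 502
    HTTPStatus.SERVICE_UNAVAILABLE,    # 503
    HTTPStatus.GATEWAY_TIMEOUT,        # 504
}


def classify_status(code: int) -> str:
    """Map an HTTP status code to a coarse action for a scraper."""
    if code in RETRYABLE:
        return "retry"
    if 200 <= code < 300:
        return "ok"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "client_error"
    if 500 <= code < 600:
        return "server_error"
    return "unknown"
```

A caller might retry on `"retry"`, skip the page on `"client_error"`, and log anything else for later inspection.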
## Installation
You can install PowerScrape using pip:
```bash
pip install PowerScrape
```

## Usage

```python
from PowerScrape import Scraper

# Create an instance of the Scraper class
scraper = Scraper()

# Use the various methods provided by the Scraper class to perform web scraping operations

# Example: Scrape a normal HTML page
soup = scraper.scrape_html('http://example.com')

# Example: Scrape a JSON API
data = scraper.scrape_json('http://api.example.com/data')

# Example: Scrape and download images
scraper.scrape_images('http://example.com/gallery', 'scraped_images')

# Example: Scrape JavaScript rendered page
html = scraper.scrape_javascript('http://example.com')

# Example: Extract text from a PDF file
text = scraper.scrape_pdf('http://example.com/doc.pdf')

# Example: Extract data points from a chart image
data = scraper.scrape_graph('http://example.com/chart.png')

# Example: Make an HTTP POST request
response = scraper.http_post_request('http://example.com/post_endpoint', data={'key': 'value'})

# Example: Clone a website
scraper.clone_website('http://example.com', 'cloned_website')
```
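
The calls above all depend on the network, and this README does not say which exceptions they raise on failure. One way to make such calls more robust is to wrap them in a small retry helper. The sketch below assumes nothing about PowerScrape beyond the method names shown in the usage example. `with_retries` is a hypothetical helper, and it catches `Exception` broadly only because the real exception types are not documented here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(fetch: Callable[[], T], attempts: int = 3, delay: float = 1.0) -> T:
    """Call fetch() until it succeeds or the attempts are exhausted.

    Hypothetical helper for illustration; not part of PowerScrape.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception:
            # Re-raise the last error so the caller sees the real cause.
            if attempt == attempts:
                raise
            # Linear backoff: wait a little longer after each failure.
            time.sleep(delay * attempt)
    raise AssertionError("unreachable")
```

With a `Scraper` instance in hand, a call would look like `with_retries(lambda: scraper.scrape_html('http://example.com'))`. Narrowing the `except` clause to the exceptions the library actually raises would be preferable once those are known.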
## License
This project is licensed under the MIT License - see the LICENSE file for details.
Made by ^mind-set#0001