Web scraping in 3 lines of code
Project description
ScrapKit 2.2 - Web scraping is now easy
Usage
ScrapKit 2.2 adds a function for scraping multiple pages of a website; you can print the result or save it to a file of any type. It also adds two more functions: cleanText and getAttributeFromElements.
Syntax
import scrapkit as sk
# Example URL
url = "https://en.wikipedia.org"
# Get attributes (example: get all 'src' attributes from 'img' tags)
image_src_attributes = sk.getAttributeFromElements(url, 'img', 'src')
sk.saveFile(image_src_attributes, filename='att.txt')
# OR
print(image_src_attributes)
# Scrape multiple pages
base_url = "https://pypi.org/"
num_pages = 3
all_links = sk.scrapeMultiplePages(base_url, num_pages)
sk.saveFile(all_links, filename='links.txt')
# OR
print(all_links)
# Clean text
dirty_text = "<p>Hello <b>world</b>!</p>"
cleaned_text = sk.cleanText(dirty_text)
sk.saveFile(f'cleaned text: {cleaned_text}', filename='cleaned.txt')
# OR
print(cleaned_text)
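ScrapKit's own implementation is not shown on this page. To illustrate the kind of work that getAttributeFromElements and cleanText describe, here is a minimal sketch using only Python's standard library. It works on an HTML string rather than a URL, and the names `attribute_values`, `strip_tags`, and `_Collector` are hypothetical helpers for this example, not part of ScrapKit.

```python
from html.parser import HTMLParser


class _Collector(HTMLParser):
    """Collects text nodes and the values of one attribute on one tag.

    Hypothetical helper for illustration; not part of ScrapKit.
    """

    def __init__(self, tag=None, attr=None):
        super().__init__(convert_charrefs=True)
        self.tag = tag
        self.attr = attr
        self.values = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == self.tag:
            for name, value in attrs:
                if name == self.attr and value is not None:
                    self.values.append(value)

    def handle_data(self, data):
        self.text_parts.append(data)


def attribute_values(html, tag, attr):
    """Return the value of `attr` for every `tag` element in `html`."""
    parser = _Collector(tag, attr)
    parser.feed(html)
    parser.close()
    return parser.values


def strip_tags(html):
    """Return the text content of `html` with all tags removed."""
    parser = _Collector()
    parser.feed(html)
    parser.close()
    return "".join(parser.text_parts).strip()


if __name__ == "__main__":
    print(attribute_values('<img src="a.png"><img src="b.png"/>', "img", "src"))
    print(strip_tags("<p>Hello <b>world</b>!</p>"))  # prints: Hello world!
```

The sketch deliberately skips fetching the page, so it can be tried without network access; whether ScrapKit handles malformed markup or whitespace the same way is not something this page states.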
Founder
ScrapKit is a Python package made by Ali Lodhi. Ali Lodhi is from Pakistan; he loves writing Python code and making people's work easier. He is currently working on the next update of this package.
Versions
ScrapKit 1 (Old)
- ScrapKit 1.0 - Fetch the whole HTML of a website
- ScrapKit 1.1 - Fetch the HTML, title, and text of a website
- ScrapKit 1.2 - Save the fetched HTML in a .html file
- ScrapKit 1.3 - Bug fixes
- ScrapKit 1.4 - Fetch the links on any website
- ScrapKit 1.5 - Fetch the URL of an image on a webpage
- ScrapKit 1.6 - Get element data by ID and class
- ScrapKit 1.7 - Bug fixes
ScrapKit 2 (Latest)
- ScrapKit 2.0 - Added 4 new functions
- ScrapKit 2.1 - Added saveFile function
- ScrapKit 2.2 - Added 3 new functions
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file scrapkit-2.2.tar.gz.
File metadata
- Download URL: scrapkit-2.2.tar.gz
- Upload date:
- Size: 3.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bd81bd46828d0d55b8047deaffc4510d7d26a3f10f26bcc6dc286bef8ee131a6 |
| MD5 | 5d5839068c96f65d182c9908dce8e52e |
| BLAKE2b-256 | 8feb0c1454054c344e4784e9fc4408983ec492559b6001aa480b1514ccf34901 |
File details
Details for the file scrapkit-2.2-py3-none-any.whl.
File metadata
- Download URL: scrapkit-2.2-py3-none-any.whl
- Upload date:
- Size: 3.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5bf36606225e57cb33c5c99dd1858982f1b1b38a7dbb462be220fa3451f6741b |
| MD5 | b3ecad9597659ba63135004ae290169c |
| BLAKE2b-256 | c33104387080a26b159576786dee40059576117366126f0972c14f4a7dea111b |
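A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using Python's standard `hashlib` module; the function names `sha256_of_file` and `matches_published_digest` are hypothetical, and the local file path is assumed to be wherever you saved the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare the file's digest with a published one, ignoring case and padding."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest("scrapkit-2.2.tar.gz", "bd81bd46828d0d55b8047deaffc4510d7d26a3f10f26bcc6dc286bef8ee131a6")` should return `True` for an unmodified copy of the source distribution saved under that name in the current directory.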