Web scraping in 3 lines of code

Project description

ScrapKit 2.2 - Web scraping is now easy

Usage

ScrapKit 2.2 adds a function for scraping multiple pages of a website. You can print the results or save them to a file with saveFile. It also adds two more functions, cleanText and getAttributeFromElements.

Syntax

import scrapkit as sk

# Example URL
url = "https://en.wikipedia.org"

# Get attributes (example: get all 'src' attributes from 'img' tags)
image_src_attributes = sk.getAttributeFromElements(url, 'img', 'src')
sk.saveFile(image_src_attributes, filename='att.txt')
# OR
print(image_src_attributes)

# Scrape multiple pages
base_url = "https://pypi.org/"
num_pages = 3
all_links = sk.scrapeMultiplePages(base_url, num_pages)
sk.saveFile(all_links, filename='links.txt')
# OR
print(all_links)

# Clean text
dirty_text = "<p>Hello <b>world</b>!</p>"
cleaned_text = sk.cleanText(dirty_text)
sk.saveFile(f'cleaned text: {cleaned_text}', filename='cleaned.txt')
# OR
print(cleaned_text)
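
For readers curious about what tag stripping and attribute extraction involve, here is a minimal sketch that uses only Python's standard library. It is not ScrapKit's implementation, which is not shown on this page. The helper names strip_tags and attribute_values are hypothetical, and unlike the ScrapKit call above, this sketch works on an HTML string rather than a URL.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes and ignores the tags around them."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


class _AttributeCollector(HTMLParser):
    """Collects one attribute from every occurrence of one tag."""

    def __init__(self, tag, attribute):
        super().__init__()
        self.tag = tag
        self.attribute = attribute
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        for name, value in attrs:
            if name == self.attribute and value is not None:
                self.values.append(value)


def strip_tags(html):
    """Hypothetical helper: return the text of an HTML string without tags."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()  # flush any text still buffered by the parser
    return "".join(collector.parts)


def attribute_values(html, tag, attribute):
    """Hypothetical helper: list `attribute` for every `tag` in an HTML string."""
    collector = _AttributeCollector(tag, attribute)
    collector.feed(html)
    collector.close()
    return collector.values


print(strip_tags("<p>Hello <b>world</b>!</p>"))  # Hello world!
print(attribute_values('<img src="a.png"><img src="b.png">', "img", "src"))
```

A library function that accepts a URL would first download the page and then run a step like this on the response body.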

Founder

ScrapKit is a useful Python package made by Ali Lodhi. Ali Lodhi is from Pakistan. He loves writing Python code that makes people's work easier, and he is currently working on the next update of this package.

Versions

ScrapKit 1 (Old)

  • ScrapKit 1.0 - Fetches the whole HTML of a website
  • ScrapKit 1.1 - Fetches the HTML, title, and text of a website
  • ScrapKit 1.2 - Saves the fetched HTML to a .html file
  • ScrapKit 1.3 - Bug fixes
  • ScrapKit 1.4 - Fetches the links on a web page
  • ScrapKit 1.5 - Fetches the URLs of images on a web page
  • ScrapKit 1.6 - Gets element data by ID and class
  • ScrapKit 1.7 - Bug fixes

ScrapKit 2 (Latest)

  • ScrapKit 2.0 - Added 4 new functions
  • ScrapKit 2.1 - Added saveFile function
  • ScrapKit 2.2 - Added 3 new functions

Download files

Source Distribution

scrapkit-2.2.tar.gz (3.5 kB)

Uploaded Source

Built Distribution

scrapkit-2.2-py3-none-any.whl (3.9 kB)

Uploaded Python 3

File details

Details for the file scrapkit-2.2.tar.gz.

File metadata

  • Download URL: scrapkit-2.2.tar.gz
  • Upload date:
  • Size: 3.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for scrapkit-2.2.tar.gz
Algorithm Hash digest
SHA256 bd81bd46828d0d55b8047deaffc4510d7d26a3f10f26bcc6dc286bef8ee131a6
MD5 5d5839068c96f65d182c9908dce8e52e
BLAKE2b-256 8feb0c1454054c344e4784e9fc4408983ec492559b6001aa480b1514ccf34901

File details

Details for the file scrapkit-2.2-py3-none-any.whl.

File metadata

  • Download URL: scrapkit-2.2-py3-none-any.whl
  • Upload date:
  • Size: 3.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for scrapkit-2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 5bf36606225e57cb33c5c99dd1858982f1b1b38a7dbb462be220fa3451f6741b
MD5 b3ecad9597659ba63135004ae290169c
BLAKE2b-256 c33104387080a26b159576786dee40059576117366126f0972c14f4a7dea111b
