Extracting information from the website
Project description
putali
Under construction! Give it a try!
Developed by Ujjawal Shah (c) 2022
Examples of how to use the package
Count the total number of unique URLs
import putali
urls = putali.Getallurls('https://www.bok.com.np/')
total_unique_url = len(urls.uniqueurls())
print(f'total unique urls: {total_unique_url}')
Get all URLs from which information can be extracted. Only regular webpages can be used for extraction, i.e. URLs that do not end with a '.pdf', '.jpeg', '.jpg', '.zip', or '.png' extension.
import putali
urls = putali.Getallurls('https://www.bok.com.np/')
# usefulurls() returns only the URLs that do not end with a '.pdf', '.jpeg', '.jpg', '.zip', or '.png' extension
useful_urls = urls.usefulurls()
print(f'useful urls: {useful_urls}')
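The filtering described above can be illustrated without putali. The following is only a minimal sketch of extension-based filtering, not putali's actual implementation: the helper name filter_useful_urls, the constant SKIPPED_EXTENSIONS, and the example.com URLs are all hypothetical, and only the list of extensions comes from the description above.

```python
from urllib.parse import urlparse

# Extensions named in the description above; URLs ending in these are skipped.
SKIPPED_EXTENSIONS = ('.pdf', '.jpeg', '.jpg', '.zip', '.png')


def filter_useful_urls(urls):
    """Hypothetical helper: keep URLs whose path does not end in a skipped extension."""
    useful = []
    for url in urls:
        # Compare only the path, lower-cased, so query strings and case do not matter.
        path = urlparse(url).path.lower()
        if not path.endswith(SKIPPED_EXTENSIONS):
            useful.append(url)
    return useful


# Illustrative input only; these are not real crawl results.
sample = [
    'https://www.example.com/about',
    'https://www.example.com/report.PDF',
    'https://www.example.com/logo.png?v=2',
    'https://www.example.com/contact.html',
]
print(filter_useful_urls(sample))
```

Checking the parsed path rather than the raw string is a design choice in this sketch, so that a link such as 'logo.png?v=2' is still recognised as an image.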
Print all email addresses found on the website
import putali
urls = putali.Getallurls('https://www.bok.com.np/')
email_address = urls.emails()
print(f'emails: {email_address}')
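For readers curious how email addresses can be pulled out of page text in general, here is a rough sketch using Python's standard re module. It is an assumption-laden illustration, not a description of how putali's emails() works: the helper find_emails, the pattern EMAIL_PATTERN, and the sample text are hypothetical, and the pattern is deliberately simple and will not match every valid address.

```python
import re

# Simplified pattern for email-like strings; an assumption, not a full RFC 5322 matcher.
EMAIL_PATTERN = re.compile(r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}')


def find_emails(text):
    """Hypothetical helper: return the unique email-like strings in text, sorted."""
    return sorted(set(EMAIL_PATTERN.findall(text)))


# Illustrative text only; not taken from any real website.
page_text = 'Contact info@example.com or support@example.com. Again: info@example.com'
print(find_emails(page_text))
```

Collecting matches into a set removes addresses that appear more than once on a page.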
Download files
Source Distribution
putali-0.0.1.tar.gz (3.1 kB)
Built Distribution
putali-0.0.1-py3-none-any.whl (3.3 kB)