Extracting information from the website
Project description
putali
Under construction! Give it a try!
Developed by Ujjawal Shah (c) 2022
Examples of how to use the package
Count the total number of unique URLs
import putali
urls = putali.Getallurls('https://www.bok.com.np/')
total_unique_url = len(urls.uniqueurls())
print(f'total unique urls: {total_unique_url}')
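For readers curious what counting unique URLs involves, here is a minimal standard-library sketch. It is not putali's implementation: the names `LinkCollector` and `unique_links` are hypothetical, and it works on an HTML string rather than crawling a live site. It assumes that links differing only in their `#fragment` point to the same page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Hypothetical helper: collect absolute, de-duplicated <a href> targets."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()  # a set keeps each URL only once

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                absolute = urljoin(self.base_url, value)
                # Drop "#fragment" so the same page is not counted twice.
                self.links.add(urldefrag(absolute).url)


def unique_links(html, base_url):
    """Return the set of unique absolute URLs linked from one HTML page."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


sample_html = """
<a href="/about">About</a>
<a href="/about#team">Team</a>
<a href="https://example.com/contact">Contact</a>
"""
links = unique_links(sample_html, "https://example.com/")
print(f"total unique urls: {len(links)}")
```

A set is used so that duplicates disappear without any extra bookkeeping.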
Get all URLs from which information can be extracted. Only webpages can be used to extract information, so URLs ending in a file extension such as '.pdf', '.jpeg', '.jpg', '.zip' or '.png' are excluded.
import putali
urls = putali.Getallurls('https://www.bok.com.np/')
# usefulurls() filters out URLs ending in '.pdf', '.jpeg', '.jpg', '.zip' or '.png' and returns the rest
useful_urls = urls.usefulurls()
print(f'useful urls: {useful_urls}')
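The extension filter described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not putali's code: `SKIPPED_EXTENSIONS`, `is_useful` and `useful_urls` are hypothetical names, and the sketch assumes the check is case-insensitive and ignores query strings.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions listed in the description above.
SKIPPED_EXTENSIONS = {".pdf", ".jpeg", ".jpg", ".zip", ".png"}


def is_useful(url):
    """Return True when the URL path does not end in a skipped extension."""
    path = urlparse(url).path  # excludes the query string and fragment
    return PurePosixPath(path).suffix.lower() not in SKIPPED_EXTENSIONS


def useful_urls(urls):
    """Keep only the URLs that look like ordinary webpages."""
    return [url for url in urls if is_useful(url)]


candidates = [
    "https://example.com/about",
    "https://example.com/report.PDF",
    "https://example.com/logo.png?v=2",
    "https://example.com/news/index.html",
]
filtered = useful_urls(candidates)
print(f"useful urls: {filtered}")
```

Checking the parsed path rather than the raw string means a URL such as `logo.png?v=2` is still recognised as an image.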
Print all the email addresses found on the website
import putali
urls = putali.Getallurls('https://www.bok.com.np/')
email_address = urls.emails()
print(f'emails: {email_address}')
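One common way to pull email addresses out of page text is a regular expression. The sketch below shows that general technique. Whether putali works this way is an assumption, and `EMAIL_PATTERN` and `find_emails` are hypothetical names. The pattern is deliberately simple and will not match every valid address.

```python
import re

# Simplified pattern: local part, "@", domain, and a top-level domain
# of at least two letters. Not a full RFC 5322 validator.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text):
    """Return the sorted, de-duplicated email addresses found in text."""
    return sorted(set(EMAIL_PATTERN.findall(text)))


sample = (
    'Contact <a href="mailto:info@example.com">info@example.com</a> '
    "or support@example.org."
)
found = find_emails(sample)
print(f"emails: {found}")
```

Converting the matches to a set removes addresses that appear both in a `mailto:` link and in the visible text.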
Download files
Source Distribution
putali-0.0.1.tar.gz (3.1 kB)
Built Distribution
putali-0.0.1-py3-none-any.whl (3.3 kB)
File details
Details for the file putali-0.0.1.tar.gz.
File metadata
- Download URL: putali-0.0.1.tar.gz
- Upload date:
- Size: 3.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.7.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0b4aebec23e9df6268c39d97f8deb26fc6e1e69e0c133d8a71ef749fd7063fa3
MD5 | 8ed7339162b3036d835e042623d56605
BLAKE2b-256 | 37e0466a6db3b4e2aec8425e3c9877943efc323aab1fc0c3a82d544e98c3ab2f
File details
Details for the file putali-0.0.1-py3-none-any.whl.
File metadata
- Download URL: putali-0.0.1-py3-none-any.whl
- Upload date:
- Size: 3.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.7.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | f72f6632e05d3bc4852b2b0a68e72fe81b85d5eb04321646c9b6248aad1f01e6
MD5 | 5d06274eb14e9dbba2b8e5fd0356f673
BLAKE2b-256 | 5fb24d598f425410f5f8dee038b6e05030afe11eb90af59be99ee52bc150730b