A Python library to crawl the details of a URL.
Overview
url_crawler is a Python library to crawl the details of a URL.
Usage
from url_crawler import url_crawler

# url -> string URL to crawl for information (placeholder value shown here).
url = "https://example.com"
package_details = url_crawler(url)
print(package_details.url)
print(package_details.domain)
print(package_details.check_https)
print(package_details.dot_count)
print(package_details.digit_count)
print(package_details.url_length)
Utilities
| Name | Output | Description |
|---|---|---|
| url | str | Returns the string url. |
| domain | str | Returns the domain of the url. |
| registrar | str | Returns the registrar for the given URL. |
| registered_country | str | Returns the registered domain country of the given URL. |
| whois | dict | Returns the whois information of the given URL. |
| registration_date | int | Returns the number of days since registration of the given URL. |
| expiry_date | int | Returns the number of days to expiration of the given URL. |
| intended_lifespan | int | Returns the number of days of intended lifespan of the given URL. |
| dot_count | int | Returns the number of dots (.) in the given URL. |
| digit_count | int | Returns the digit count in the given URL. |
| url_length | int | Returns the length of the given URL. |
| fragments_count | int | Returns the number of fragments in the given URL. |
| entropy | int | Returns the entropy of the given URL. |
| check_http | bool | Checks for http headers in the given URL. |
| check_https | bool | Checks for https headers in the given URL. |
| url_response | bool | Checks for the URL response. |
| check_encoding | bool | Checks for encoding in the given URL. |
| check_client | bool | Checks for client keyword in the given URL. |
| check_admin | bool | Checks for admin keyword in the given URL. |
| check_server | bool | Checks for server keyword in the given URL. |
| check_login | bool | Checks for login keyword in the given URL. |
| check_ports | bool | Checks for any ports in the given URL. |
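
To make the string-level utilities in the table more concrete, here is a minimal sketch of how a few of them could be computed with the standard library alone. It is not url_crawler's actual implementation: the function name `lexical_features`, the use of the hostname as the domain, and the per-character Shannon entropy are all assumptions made for illustration. The table lists `entropy` as `int`; Shannon entropy is generally fractional, so this sketch returns a float.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def lexical_features(url: str) -> dict:
    """Hypothetical helper: string-level URL features, standard library only."""
    parts = urlsplit(url)
    lowered = url.lower()

    # Shannon entropy in bits per character, computed over the raw URL string.
    counts = Counter(url)
    total = len(url)
    entropy = 0.0
    if total:
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())

    # urlsplit raises ValueError when the port is not a valid number.
    try:
        port = parts.port
    except ValueError:
        port = None

    return {
        "domain": parts.hostname or "",  # assumption: hostname stands in for the domain
        "dot_count": url.count("."),
        "digit_count": sum(ch.isdigit() for ch in url),
        "url_length": len(url),
        "entropy": entropy,
        "check_https": parts.scheme == "https",
        "check_login": "login" in lowered,  # plain substring match
        "check_admin": "admin" in lowered,
        "check_ports": port is not None,  # True only for an explicit port
    }


features = lexical_features("https://example.com:8080/login?id=42")
print(features)
```

The WHOIS-based fields (`registrar`, `registration_date`, `expiry_date`, and so on) and `url_response` need network access, so they are left out of this sketch.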
Requirements
The requirements.txt file lists all Python libraries this package depends on. Install them with:
pip install -r requirements.txt
Organization
├── src
│   └── url_crawler
│       ├── init <- init
│       └── url_crawler <- package source code for URL crawler
├── setup.py <- setup file
├── LICENSE <- LICENSE
├── README.md <- README
├── CONTRIBUTING.md <- contribution
├── test.py <- test cases for unit testing
└── requirements.txt <- requirements file for reproducing the code package
License
MIT
Contributions
For steps on code contribution, please see CONTRIBUTING.