
A Python library to crawl the details of a URL.


Overview


url_crawler is a Python library that crawls a URL and extracts its details, such as the domain, WHOIS record, and simple lexical features.

Usage

from url_crawler import url_crawler

# url -> string URL to crawl for information.
url = "https://example.com"  # example input
package_details = url_crawler(url)

print(package_details.url)
print(package_details.domain)
print(package_details.check_https)
print(package_details.dot_count)
print(package_details.digit_count)
print(package_details.url_length)
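
The count-style utilities are simple lexical features of the URL string. As a rough illustration of what they measure (a sketch, not the library's own implementation), the counts can be reproduced with standard string operations:

url = "https://example.com/login?id=123"

dot_count = url.count(".")                     # 1 dot
digit_count = sum(ch.isdigit() for ch in url)  # 3 digits ("123")
url_length = len(url)                          # 32 characters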

Utilities

Name                Output  Description
url                 str     Returns the URL string.
domain              str     Returns the domain of the given URL.
registrar           str     Returns the registrar for the given URL.
registered_country  str     Returns the country in which the given URL's domain is registered.
whois               dict    Returns the WHOIS information of the given URL.
registration_date   int     Returns the number of days since the given URL was registered.
expiry_date         int     Returns the number of days until the given URL's registration expires.
intended_lifespan   int     Returns the intended lifespan of the given URL, in days.
dot_count           int     Returns the dot (.) count in the given URL.
digit_count         int     Returns the digit count in the given URL.
url_length          int     Returns the length of the given URL.
fragments_count     int     Returns the fragment count in the given URL.
entropy             int     Returns the entropy of the given URL (see the sketch below the table).
check_http          bool    Checks for http headers in the given URL.
check_https         bool    Checks for https headers in the given URL.
url_response        bool    Checks for the URL response.
check_encoding      bool    Checks for encoding in the given URL.
check_client        bool    Checks for the client keyword in the given URL.
check_admin         bool    Checks for the admin keyword in the given URL.
check_server        bool    Checks for the server keyword in the given URL.
check_login         bool    Checks for the login keyword in the given URL.
check_ports         bool    Checks for any ports in the given URL.
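
The library does not document how entropy is computed; character-level Shannon entropy is the usual choice for URL features, and a minimal sketch under that assumption looks like this (note the table lists an int output, so the library may truncate or round the value):

import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    # Character-level Shannon entropy in bits: -sum(p * log2(p))
    # over the frequency p of each distinct character in s.
    counts = Counter(s)
    total = len(s)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

print(shannon_entropy("https://example.com"))  # ≈ 3.72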

Requirements

The requirements.txt file lists all the Python libraries this package depends on; they can be installed using:

pip install -r requirements.txt
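
The package itself is published on PyPI, so it can also be installed directly with pip:

pip install url_crawler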

Organization

├── src
│   ├── url_crawler
│   │   ├── __init__.py      <- package init
│   │   ├── url_crawler.py   <- package source code for the URL crawler
├── setup.py                 <- setup file
├── LICENSE                  <- license
├── README.md                <- readme
├── CONTRIBUTING.md          <- contribution guidelines
├── test.py                  <- test cases for unit testing
├── requirements.txt         <- requirements file for reproducing the code package

License

MIT

Contributions

For steps on code contribution, please see CONTRIBUTING.

