Normal URL

Convenient utility to parse and normalize URLs.

from normalurl import normalize_url, parse_normalized_url, parse_url

print(parse_url('localhost'))
print(parse_url('localhost:12345'))
print(normalize_url('localhost', scheme='http', port='1122'))
print(parse_normalized_url('localhost:12345', scheme='http', port='1122'))

URL(scheme='', hostname='localhost', port='', path='', query='', fragment='', netloc='localhost', username='', password='')
URL(scheme='', hostname='localhost', port=12345, path='', query='', fragment='', netloc='localhost:12345', username='', password='')
http://localhost:1122
URL(scheme='http', hostname='localhost', port=12345, path='', query='', fragment='', netloc='localhost:12345', username='', password='')

Usage

Normalize URL

Function normalize_url(url: Union[str, URL, ParseResult], *, scheme: str = '', hostname: str = '', port: str = '', path: str = '', query: str = '', fragment: str = '', username: str = '', password: str = '') -> str ensures that each element of the URL has a non-empty value; an element that is empty is set to the given default.

With no defaults given, the URL is returned unchanged:

print(normalize_url('localhost'))

localhost

Ensure the URL specifies a scheme and a port. Where either one is missing, fall back to the scheme http and the port 1122:

print(normalize_url('localhost', scheme='http', port='1122'))
print(normalize_url('https://localhost', scheme='http', port='1122'))
print(normalize_url('localhost:12345', scheme='http', port='1122'))
print(normalize_url('https://localhost:12345', scheme='http', port='1122'))

http://localhost:1122
https://localhost:1122
http://localhost:12345
https://localhost:12345
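The fallback rule shown above can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the library's implementation: the helper name with_defaults is hypothetical, it handles just scheme and port, and it drops credentials, query and fragment for brevity.

```python
from urllib.parse import urlsplit


def with_defaults(url, scheme="", port=""):
    """Hypothetical helper: fill in scheme and port only when they are missing."""
    # Simplification: treat the input as having a scheme only if it contains "://".
    # Otherwise prefix "//" so urlsplit reads "host:port" as a network location.
    parts = urlsplit(url if "://" in url else "//" + url)

    # Values already present in the URL win over the supplied defaults.
    final_scheme = parts.scheme or scheme
    final_port = parts.port or port

    netloc = parts.hostname or ""
    if final_port:
        netloc = f"{netloc}:{final_port}"
    prefix = f"{final_scheme}://" if final_scheme else ""
    return f"{prefix}{netloc}{parts.path}"


for candidate in ("localhost", "https://localhost",
                  "localhost:12345", "https://localhost:12345"):
    print(with_defaults(candidate, scheme="http", port="1122"))
```

For these four inputs the sketch prints the same four lines as the normalize_url example above; other inputs may well differ from what the library does.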

Parse URL

Function parse_url(url: str) -> URL parses the elements of a URL into the named tuple URL. Each element defaults to the empty string ''.

Parsed elements are:

  • scheme
  • hostname
  • port
  • path
  • query
  • fragment
  • netloc
  • username
  • password

print(parse_url('localhost'))
print(parse_url('localhost:12345'))
print(parse_url('http://example.org'))
print(parse_url('https://example.org:8080/user/search?name=Bob&age=30#profile'))
print(parse_url('https://admin:123@example.org:8080/user/search?name=Bob&age=30#profile'))

URL(scheme='', hostname='localhost', port='', path='', query='', fragment='', netloc='localhost', username='', password='')
URL(scheme='', hostname='localhost', port=12345, path='', query='', fragment='', netloc='localhost:12345', username='', password='')
URL(scheme='http', hostname='example.org', port='', path='', query='', fragment='', netloc='example.org', username='', password='')
URL(scheme='https', hostname='example.org', port=8080, path='/user/search', query='name=Bob&age=30', fragment='profile', netloc='example.org:8080', username='', password='')
URL(scheme='https', hostname='example.org', port=8080, path='/user/search', query='name=Bob&age=30', fragment='profile', netloc='admin:123@example.org:8080', username='admin', password='123')
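To show how a result like the ones above can be consumed, here is a small stand-in named tuple with the same field names and order. URLSketch is an assumption made for illustration; it is not the library's URL class, and the field types are inferred from the printed output only.

```python
from typing import NamedTuple, Union


class URLSketch(NamedTuple):
    """Stand-in for the library's URL tuple (an assumption, not its source)."""
    scheme: str = ''
    hostname: str = ''
    port: Union[int, str] = ''  # printed as an int when present, '' when absent
    path: str = ''
    query: str = ''
    fragment: str = ''
    netloc: str = ''
    username: str = ''
    password: str = ''


url = URLSketch(scheme='https', hostname='example.org', port=8080,
                path='/user/search', netloc='example.org:8080')

# Fields are available by name or by position, as with any named tuple.
print(url.hostname)
scheme, hostname, port = url[:3]

# An empty string is falsy, so a missing element is easy to test for.
if not url.username:
    print('no credentials in URL')
```

Because absent elements are '' rather than None, checks such as `if url.port:` work without a separate None test.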

Parse normalized URL

Function parse_normalized_url(url: Union[str, ParseResult], *, scheme: str = '', hostname: str = '', port: str = '', path: str = '', query: str = '', fragment: str = '', username: str = '', password: str = '') -> URL works like normalize_url followed by parse_url: it normalizes the URL, then parses it.

Ensure the URL specifies a scheme and a port. Where either one is missing, fall back to the scheme http and the port 1122:

print(parse_normalized_url('localhost', scheme='http', port='1122'))
print(parse_normalized_url('https://localhost', scheme='http', port='1122'))
print(parse_normalized_url('localhost:12345', scheme='http', port='1122'))
print(parse_normalized_url('https://localhost:12345', scheme='http', port='1122'))

URL(scheme='http', hostname='localhost', port=1122, path='', query='', fragment='', netloc='localhost:1122', username='', password='')
URL(scheme='https', hostname='localhost', port=1122, path='', query='', fragment='', netloc='localhost:1122', username='', password='')
URL(scheme='http', hostname='localhost', port=12345, path='', query='', fragment='', netloc='localhost:12345', username='', password='')
URL(scheme='https', hostname='localhost', port=12345, path='', query='', fragment='', netloc='localhost:12345', username='', password='')

Why not just use urllib?

Because urllib does not cover some common edge cases.

Rewriting the first example with urllib shows the problem:

from urllib.parse import urlparse

print(urlparse('localhost'))
print(urlparse('localhost:12345'))

ParseResult(scheme='', netloc='', path='localhost', params='', query='', fragment='')
ParseResult(scheme='localhost', netloc='', path='12345', params='', query='', fragment='')

As you can see, urllib treats localhost as a path, although it is actually a hostname. Worse, given <host>:<port>, it parses the host as the scheme (like http) and the port as the path, which is wrong.
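For comparison, urllib only recognizes a network location when the text starts with "//". The short sketch below uses nothing but the standard library to show the difference; the variable names are illustrative, and this manual prefixing is the kind of step a normalizing helper saves you from writing.

```python
from urllib.parse import urlsplit

# Without "//", urllib finds no network location, so hostname and port are None.
bare = urlsplit("localhost:12345")

# With a leading "//", the same text is read as host and port.
prefixed = urlsplit("//localhost:12345")

print(bare.hostname, bare.port)
print(prefixed.hostname, prefixed.port)
```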
