Convenient utility to parse and normalize urls.
Project description
Normal URL
Convenient utility to parse and normalize urls.
from normalurl import *
print(parse_url('localhost'))
print(parse_url('localhost:12345'))
print(normalize_url('localhost', scheme='http', port='1122'))
print(parse_normalized_url('localhost:12345', scheme='http', port='1122'))
URL(scheme='', hostname='localhost', port='', path='', query='', fragment='', netloc='localhost', username='', password='')
URL(scheme='', hostname='localhost', port=12345, path='', query='', fragment='', netloc='localhost:12345', username='', password='')
http://localhost:1122
URL(scheme='http', hostname='localhost', port=12345, path='', query='', fragment='', netloc='localhost:12345', username='', password='')
Usage
Normalize URL
Function normalize_url(url: Union[str, URL, ParseResult], *, scheme: str = '', hostname: str = '', port: str = '', path: str = '', query: str = '', fragment: str = '', username: str = '', password: str = '') -> str
ensures each element of url has non-empy value otherwise sets default one.
Just do nothing:
print(normalize_url('localhost'))
localhost
Ensure url describes scheme and port. If not persist user http scheme and port 1122.
print(normalize_url('localhost', scheme='http', port='1122'))
print(normalize_url('https://localhost', scheme='http', port='1122'))
print(normalize_url('localhost:12345', scheme='http', port='1122'))
print(normalize_url('https://localhost:12345', scheme='http', port='1122'))
http://localhost:1122
https://localhost:1122
http://localhost:12345
https://localhost:12345
Parse URL
Function parse_url(url: str) -> URL
parse url elements in named tuple structure URL
. Default value for each element is empty string ''
.
Parsed elements are:
- scheme
- hostname
- port
- path
- query
- fragment
- netloc
- username
- password
print(parse_url('localhost'))
print(parse_url('localhost:12345'))
print(parse_url('http://example.org'))
print(parse_url('https://example.org:8080/user/search?name=Bob&age=30#profile'))
print(parse_url('https://admin:123@example.org:8080/user/search?name=Bob&age=30#profile'))
URL(scheme='', hostname='localhost', port='', path='', query='', fragment='', netloc='localhost', username='', password='')
URL(scheme='', hostname='localhost', port=12345, path='', query='', fragment='', netloc='localhost:12345', username='', password='')
URL(scheme='http', hostname='example.org', port='', path='', query='', fragment='', netloc='example.org', username='', password='')
URL(scheme='https', hostname='example.org', port=8080, path='/user/search', query='name=Bob&age=30', fragment='profile', netloc='example.org:8080', username='', password='')
URL(scheme='https', hostname='example.org', port=8080, path='/user/search', query='name=Bob&age=30', fragment='profile', netloc='admin:123@example.org:8080', username='admin', password='123')
Parse normalized URL
Function parse_normalized_url(url: Union[str, ParseResult], *, scheme: str = '', hostname: str = '', port: str = '', path: str = '', query: str = '', fragment: str = '', username: str = '', password: str = '') -> URL
works exactly like normalize_url
+ parse_url
- normalizes url then parses it.
Ensure url describes scheme and port. If not persist user http scheme and port 1122.
print(parse_normalized_url('localhost', scheme='http', port='1122'))
print(parse_normalized_url('https://localhost', scheme='http', port='1122'))
print(parse_normalized_url('localhost:12345', scheme='http', port='1122'))
print(parse_normalized_url('https://localhost:12345', scheme='http', port='1122'))
URL(scheme='http', hostname='localhost', port=1122, path='', query='', fragment='', netloc='localhost:1122', username='', password='')
URL(scheme='https', hostname='localhost', port=1122, path='', query='', fragment='', netloc='localhost:1122', username='', password='')
URL(scheme='http', hostname='localhost', port=12345, path='', query='', fragment='', netloc='localhost:12345', username='', password='')
URL(scheme='https', hostname='localhost', port=12345, path='', query='', fragment='', netloc='localhost:12345', username='', password='')
Why not just use urllib?
Because urlib does not cover some common edge cases.
Just rewrite the first example with urllib:
from urllib.parse import urlparse
print(urlparse('localhost'))
print(urlparse('localhost:12345'))
ParseResult(scheme='', netloc='', path='localhost', params='', query='', fragment='')
ParseResult(scheme='localhost', netloc='', path='12345', params='', query='', fragment='')
As you can see, urllib considers localhost as path while it is actually hostname. Moreover, <host>:<port> are parsed as scheme (like http) for localhost and path for port which is absolutely wrong.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file normalurl-1.0.0.tar.gz
.
File metadata
- Download URL: normalurl-1.0.0.tar.gz
- Upload date:
- Size: 3.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: python-requests/2.28.2
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 2bd2a1b85d7ab5924e4a8bcbed7ac97132460d814d4ff5bda15fba723ee2f5df |
|
MD5 | 319ecd6b865f64408ef245f506f6bb90 |
|
BLAKE2b-256 | 3f4ef9a09af932c6a18ed4307be7d79b680c4adde03b7ebb6ed0ed80c6b8e2ba |
File details
Details for the file normalurl-1.0.0-py2.py3-none-any.whl
.
File metadata
- Download URL: normalurl-1.0.0-py2.py3-none-any.whl
- Upload date:
- Size: 2.9 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: python-requests/2.28.2
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 1a828d5f564bee21a2d605bb49df67079de07a464bcae4faee7fa828691880d9 |
|
MD5 | bbf53b792b06d7d963ad4b049ccabb2f |
|
BLAKE2b-256 | 1630df85314a0bf577ca0fcddb2b41d3f2556c858ec76c73fe233eae3ffdedab |