Ural
A helper library full of URL-related heuristics.
Installation
You can install ural with pip using the following command:
pip install ural
Usage
ensure_protocol
A function checking if the url has a protocol, and adding the given one if there is none.
from ural import ensure_protocol
ensure_protocol('www2.lemonde.fr', protocol='https')
>>> 'https://www2.lemonde.fr'
Arguments
- url string: URL to format.
- protocol string: protocol to use if there is none in url. Is 'http' by default.
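The check described above can be sketched with the standard library alone. This is only an illustration of the heuristic, not ural's actual implementation: the names `PROTOCOL_RE` and `ensure_protocol_sketch` are hypothetical, and the pattern assumes that a protocol is anything shaped like `scheme://` at the start of the string.

```python
import re

# Hypothetical pattern (an assumption, not taken from ural):
# a scheme name followed by "://" at the very start of the string.
PROTOCOL_RE = re.compile(r'^[a-zA-Z][a-zA-Z0-9+.-]*://')


def ensure_protocol_sketch(url, protocol='http'):
    """Prepend `protocol` only when `url` does not already carry one."""
    if PROTOCOL_RE.match(url):
        return url
    return protocol + '://' + url


print(ensure_protocol_sketch('www2.lemonde.fr', protocol='https'))
print(ensure_protocol_sketch('ftp://www2.lemonde.fr'))
```

In this sketch, a url that already has a protocol is returned untouched, whatever that protocol is.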
force_protocol
A function force-replacing the protocol of the given url.
from ural import force_protocol
force_protocol('https://www2.lemonde.fr', protocol='ftp')
>>> 'ftp://www2.lemonde.fr'
Arguments
- url string: URL to format.
- protocol string: protocol wanted in the output url. Is 'http' by default.
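To make the difference from ensure_protocol concrete, here is a minimal standard-library sketch of a forced replacement. It is not ural's code: `PROTOCOL_RE` and `force_protocol_sketch` are hypothetical names, and adding the protocol when the url has none is an assumption of this sketch.

```python
import re

# Hypothetical pattern (an assumption): "scheme://" at the start of the string.
PROTOCOL_RE = re.compile(r'^[a-zA-Z][a-zA-Z0-9+.-]*://')


def force_protocol_sketch(url, protocol='http'):
    """Drop any existing protocol, then prepend the wanted one."""
    bare = PROTOCOL_RE.sub('', url)
    return protocol + '://' + bare


print(force_protocol_sketch('https://www2.lemonde.fr', protocol='ftp'))
```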
is_url
A function returning True if its argument is a url.
from ural import is_url
is_url('https://www2.lemonde.fr')
>>> True
Arguments
- string string: string to test.
- require_protocol boolean: whether the argument has to have a protocol to be considered a url. Is True by default.
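The effect of require_protocol can be illustrated with a deliberately loose, standard-library check. The pattern below is an assumption made for this example and is far simpler than whatever ural uses internally; `URL_RE` and `is_url_sketch` are hypothetical names.

```python
import re

# Hypothetical, loose pattern (an assumption, not ural's):
# optional "scheme://", a host containing at least one dot,
# then an optional path, query or fragment without whitespace.
URL_RE = re.compile(
    r'^(?P<protocol>[a-zA-Z][a-zA-Z0-9+.-]*://)?'
    r'[^\s/?#.]+(?:\.[^\s/?#.]+)+'
    r'(?:[/?#]\S*)?$'
)


def is_url_sketch(string, require_protocol=True):
    """Return True when `string` looks like a url under the pattern above."""
    match = URL_RE.match(string)
    if match is None:
        return False
    if require_protocol and match.group('protocol') is None:
        return False
    return True


print(is_url_sketch('https://www2.lemonde.fr'))
print(is_url_sketch('www2.lemonde.fr'))
print(is_url_sketch('www2.lemonde.fr', require_protocol=False))
```

With these assumptions, the second call is rejected only because the protocol is missing, and relaxing require_protocol accepts it.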
normalize_url
Function normalizing the given url by stripping parts that usually do not discriminate between resources, such as irrelevant query items or sub-domains.
This is useful when attempting to match urls that point to the same resource but were written slightly differently, for instance when shared on social media.
from ural import normalize_url
normalize_url('https://www2.lemonde.fr/index.php?utm_source=google')
>>> 'lemonde.fr'
Arguments
- url string: URL to normalize.
- sort_query boolean [True]: whether to sort query items.
- strip_authentication boolean [True]: whether to strip authentication.
- strip_index boolean [True]: whether to strip trailing index.
- strip_trailing_slash boolean [False]: whether to strip trailing slash.
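To show how the options above might interact, here is a rough standard-library sketch of this kind of normalization. It is not ural's implementation and its output may differ from the library's on other inputs. Everything named here (`normalize_url_sketch`, `WWW_RE`, `INDEX_RE`, `TRACKING_PREFIXES`) is hypothetical, and the sketch assumes that "irrelevant" query items are those starting with `utm_` and that the only sub-domains stripped are `www`, `www2` and the like.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit

# All patterns below are assumptions made for this sketch.
PROTOCOL_RE = re.compile(r'^[a-zA-Z][a-zA-Z0-9+.-]*://')
WWW_RE = re.compile(r'^www\d*\.')            # "www.", "www2.", ...
INDEX_RE = re.compile(r'/index(?:\.\w+)?$')  # "/index", "/index.php", ...
TRACKING_PREFIXES = ('utm_',)


def normalize_url_sketch(url, sort_query=True, strip_authentication=True,
                         strip_index=True, strip_trailing_slash=False):
    # urlsplit only finds the host when a protocol is present.
    if not PROTOCOL_RE.match(url):
        url = 'http://' + url
    parts = urlsplit(url)

    # Separate "user:password@" from the host, then drop the www prefix.
    auth, _, host = parts.netloc.rpartition('@')
    host = WWW_RE.sub('', host)
    if auth and not strip_authentication:
        host = auth + '@' + host

    path = parts.path
    if strip_index:
        path = INDEX_RE.sub('', path)
    if strip_trailing_slash:
        path = path.rstrip('/')

    # Drop tracking items, then optionally sort what is left.
    items = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if not key.startswith(TRACKING_PREFIXES)]
    if sort_query:
        items.sort()

    result = host + path
    if items:
        result += '?' + urlencode(items)
    if parts.fragment:
        result += '#' + parts.fragment
    return result


print(normalize_url_sketch('https://www2.lemonde.fr/index.php?utm_source=google'))
print(normalize_url_sketch('http://user:pw@www.example.com/a/?b=2&a=1'))
```

Under these assumptions, two differently written links to the same page, such as `http://www.lemonde.fr/index.html` and `https://lemonde.fr`, reduce to the same string and can be compared directly.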
strip_protocol
Function removing the protocol from the url.
from ural import strip_protocol
strip_protocol('https://www2.lemonde.fr/index.php')
>>> 'www2.lemonde.fr/index.php'
Arguments
- url string: URL to format.
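One situation where removing the protocol helps is grouping links that differ only by their scheme. The sketch below shows the idea with the standard library only. `strip_protocol_sketch` and `PROTOCOL_RE` are hypothetical stand-ins, not ural's code, and the sample list is made up for the example.

```python
import re

# Hypothetical pattern (an assumption): "scheme://" at the start of the string.
PROTOCOL_RE = re.compile(r'^[a-zA-Z][a-zA-Z0-9+.-]*://')


def strip_protocol_sketch(url):
    """Remove a leading "scheme://" if there is one."""
    return PROTOCOL_RE.sub('', url)


# Made-up sample: the same page written with and without a protocol.
shared = [
    'https://www2.lemonde.fr/index.php',
    'http://www2.lemonde.fr/index.php',
    'www2.lemonde.fr/index.php',
]
unique = {strip_protocol_sketch(url) for url in shared}
print(unique)
```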