Library to strip http urls of tracking elements, and use shorthand variants of urls
Project description
Url Strip
A library for stripping urls of tracking and bloat
Non Pythonic
Usage was designed around typesafety and zero runtime surprises, so instead of raising exceptions, url_strip will return exceptions instead of results, this library works best with a typechecker because of this
Usage
Some basic usage. These examples are all runnable and meet typing standards.
Using is_instance and unwrap
"""
Uses is_instance and unwrap to parse a url
"""
from url_strip import Ok, strip_url
# Ok and Err provide the same 3 methods with fully typed outcomes for a Result[T, E]:
# is_instance will TypeGuard the result to the variant
# get will return an Optional[T] of the variant
# (and is not suitable for cases where T or E are None)
# unwrap with return the requested variant or raise an exception
if Ok.is_instance(
v := strip_url(
"https://youtube.com/watch?v=dQw4w9WgXcQ"
"&trackerinfo=youraddresshere&mldata=whattimeyouwokeupthismorning"
)
):
# using unwrap is runtime safe here as we just typeguarded it is an Ok variant
url = Ok.unwrap(v)
# strip_url returns a Result[HttpUrl, UrlError], and HttpUrl provides a into_str method
# to get what most people expect as a final output
value = url.into_str()
print(value) # -> https://youtu.be/dQw4w9WgXcQ
else:
... # Insert error handling here
Using get
"""
Uses get and a None check to parse an invalid url
"""
from url_strip import strip_url, Ok, Err
if (v := Ok.get(result := strip_url("foo"))) is not None:
v.into_str()
else:
raise Err.unwrap(result)
Writing extra domain rules
"""
Uses register to add foo.com as a checked domain
"""
from url_strip import UrlError, Ok, Err, StripFuncResult, HttpUrl, register
@register(domain="foo.com")
def foo_com_strip( # pyright: ignore[reportUnusedFunction]
url: HttpUrl
) -> StripFuncResult: # ('ok', HttpUrl) | ('err', UrlError)
if url.path == "/failure":
return Err(UrlError("This is an awful failure state"))
return Ok(url) # dont change url at all, its perfect as is
Popular Websites Supported
Some popular websites have pre-register
'd rules, while any non registered websites will go to a default query stripper
The pre registered domains are as follows:
Site | Domains |
---|---|
amazon | www.amazon.com , www.amazon.co.uk |
ebay | ebay.com , www.ebay.com , www.ebay.co.uk |
www.reddit.com |
|
tiktok | www.tiktok.com , vm.tiktok.com |
twitter.com |
|
youtube | youtube.com , www.youtube.com |
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
url_strip-0.0.3.tar.gz
(9.3 kB
view hashes)
Built Distribution
Close
Hashes for url_strip-0.0.3-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 07cf13eab91d9cb49b22cc0c3944915b326b6d87ed6601d3d3da17c43b5e7a6e |
|
MD5 | d789e5f395fdbf56bd3b8194141bf58b |
|
BLAKE2b-256 | 316224ecc4e45e7b6b0d1b6a2aac692a0bcfc1d0669661e6aa204fc368731afb |