
A library that strips tracking elements from HTTP URLs and rewrites them to their shorthand variants


Url Strip

A library for stripping URLs of tracking and bloat.

Non Pythonic

Usage is designed around type safety and zero runtime surprises. Instead of raising exceptions, url_strip returns them as values in place of results, so this library works best with a type checker.
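The idea of returning an error as a value can be sketched with the standard library alone. This is a minimal illustration of the pattern, not url_strip's implementation: the function name parse_port and its rules are hypothetical and exist only for this example.

```python
from typing import Union


def parse_port(text: str) -> Union[int, ValueError]:
    """Return the port number, or a ValueError *value* instead of raising it.

    Hypothetical helper, used only to illustrate errors-as-values.
    """
    if not (text.isascii() and text.isdigit()):
        return ValueError(f"not a port number: {text!r}")
    port = int(text)
    if port > 65535:
        return ValueError(f"port out of range: {port}")
    return port


result = parse_port("8080")
if isinstance(result, ValueError):
    # The caller decides what to do with the error; nothing was raised.
    message = f"failed: {result}"
else:
    # A type checker narrows result to int in this branch.
    message = f"port {result}"
```

Because the error is part of the return type, a type checker can flag any caller that uses the result without first checking which variant it holds.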

Usage

Some basic usage. These examples are all runnable and meet typing standards.

Using is_instance and unwrap

"""
Uses is_instance and unwrap to parse a url
"""

from url_strip import Ok, strip_url

# Ok and Err provide the same 3 methods with fully typed outcomes for a Result[T, E]:
#  is_instance will TypeGuard the result to the variant
#  get will return an Optional[T] of the variant
#    (and is not suitable for cases where T or E are None)
#  unwrap will return the requested variant or raise an exception
if Ok.is_instance(
    v := strip_url(
        "https://youtube.com/watch?v=dQw4w9WgXcQ"
        "&trackerinfo=youraddresshere&mldata=whattimeyouwokeupthismorning"
        )
    ):
    # using unwrap is runtime safe here as we just typeguarded it is an Ok variant
    url = Ok.unwrap(v)
    # strip_url returns a Result[HttpUrl, UrlError], and HttpUrl provides an into_str method
    #  to get what most people expect as a final output
    value = url.into_str()

    print(value)  # -> https://youtu.be/dQw4w9WgXcQ
else:
    ...  # Insert error handling here

Using get

"""
Uses get and a None check to parse an invalid url
"""

from url_strip import strip_url, Ok, Err

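# Ok.get returns the HttpUrl when result is an Ok variant, otherwise None
if (v := Ok.get(result := strip_url("foo"))) is not None:
    print(v.into_str())
else:
    # result is not an Ok variant here, so Err.unwrap returns the UrlError, which we raise
    raise Err.unwrap(result)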
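The earlier note that get is not suitable when T or E can be None can be illustrated with a small sketch. It assumes a result modelled as a plain tagged tuple; the names TaggedResult, get_ok and is_ok are hypothetical and are not part of url_strip.

```python
from typing import Optional, Tuple

# Hypothetical stand-in for a result: ("ok", value) or ("err", value)
TaggedResult = Tuple[str, Optional[str]]


def get_ok(result: TaggedResult) -> Optional[str]:
    """Return the value of an ok result, or None for an err result."""
    tag, value = result
    return value if tag == "ok" else None


def is_ok(result: TaggedResult) -> bool:
    """Check the tag directly, which stays unambiguous even if the value is None."""
    return result[0] == "ok"


ok_with_none: TaggedResult = ("ok", None)
err_with_none: TaggedResult = ("err", None)

# Both calls return None, so a None check cannot tell the variants apart.
same_answer = get_ok(ok_with_none) is None and get_ok(err_with_none) is None
```

When the success value may legitimately be None, checking the variant itself (as is_instance does in the first example) avoids this ambiguity.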

Writing extra domain rules

"""
Uses register to add foo.com as a checked domain
"""
from url_strip import UrlError, Ok, Err, StripFuncResult, HttpUrl, register

@register(domain="foo.com")
def foo_com_strip( # pyright: ignore[reportUnusedFunction]
    url: HttpUrl
    ) -> StripFuncResult:  # ('ok', HttpUrl) | ('err', UrlError)

    if url.path == "/failure":
        return Err(UrlError("This is an awful failure state"))

    return Ok(url)  # don't change the url at all, it's perfect as is

Popular Websites Supported

Some popular websites have pre-registered rules, while any non-registered website goes to a default query stripper.
The pre-registered domains are as follows:

Site      Domains
amazon    www.amazon.com, www.amazon.co.uk
ebay      ebay.com, www.ebay.com, www.ebay.co.uk
reddit    www.reddit.com
tiktok    www.tiktok.com, vm.tiktok.com
twitter   twitter.com
youtube   youtube.com, www.youtube.com
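The behaviour of the default query stripper is not described here. As a rough sketch of what a generic query stripper could look like, the following uses only urllib.parse; the function name drop_tracking_params and the list of tracking parameters are assumptions made for illustration, not the rules url_strip applies.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed examples of tracking parameters; not taken from url_strip
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"fbclid", "gclid"}


def drop_tracking_params(url: str) -> str:
    """Remove query parameters that look like trackers and keep the rest in order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_NAMES and not key.startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the url with only the parameters that were kept
    return urlunsplit(parts._replace(query=urlencode(kept)))


cleaned = drop_tracking_params(
    "https://example.com/page?id=7&utm_source=mail&fbclid=abc"
)
```

A deny-list like this keeps parameters the page needs, such as id above, at the cost of missing trackers that are not on the list.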
