Skip to main content

"A library for normalizing gemini:// URLs"

Project description

gemurl

A library for normalizing gemini:// URLs

Usage

The package can be installed from PyPI using pip:

$ pip install gemurl

Use it like this:

from gemurl import normalize_url

print(normalize_url("GEMINI://EXAMPLE.COM:1965"))
# Should print: gemini://example.com/

The package also includes a command like tool:

$ gemurl normalize GEMINI://EXAMPLE.COM:1965
gemini://example.com/

API of gemurl Module

The GemurlError is the base class for all exceptions raised by functions in this module.

normalize_url(url: str) -> str

Normalize a URL according to RFC 3986 and the Gemini specification. Raises NormalizarionError if the input is not a valid (gemini) URL. Raises NonGeminiUrlError (which is a subclass of NormalizarionError) if the scheme is non-gemini.

host_port_pair_from_url(normalized_url: str) -> tuple[str, int]

Get a (host, port) tuple suitable for connecting a socket. The input URL must be normalized for this function to work correctly.

capsule_prefix(normalized_url: str) -> str

Find the prefix of the URL that uniquely identifies its capsule. Two URLs belong to the same capsule if the hostnames and ports match, but if the path begins with "/~USER/" or "/users/USER/", then the USER part needs to match too.

This definition is borrowed from khuxkm's Molniya. Thanks!

Features

The normamlization function ensures that:

  • Scheme and host case are always lowercase.
  • The host is IDNA-encoded, if it has non-ASCII characters.
  • The port is removed if it has the default 1965 value.
  • Percent encoding hex digit are always uppercase.
  • Non-reserved characters are never percent encoded.
  • Non-printable (and non-ASCII) characters are always percent encoded.
  • Percent encoding of reserved characters are not changed (since that would change the meaning).
  • URLs with no paths get a single slash path.
  • . and .. path segments are collapsed.
  • Empty queries and fragments are distinct from missing ones.
  • See test cases in test_url.py in te source code for the full list.

Ideas for Future Improvements

  • Function to resolve relative URLs
  • Function to convert URLs to display form (decode percent and IDNA encoding)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gemurl-1.1.0.tar.gz (3.6 kB view details)

Uploaded Source

Built Distribution

gemurl-1.1.0-py3-none-any.whl (3.9 kB view details)

Uploaded Python 3

File details

Details for the file gemurl-1.1.0.tar.gz.

File metadata

  • Download URL: gemurl-1.1.0.tar.gz
  • Upload date:
  • Size: 3.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.2

File hashes

Hashes for gemurl-1.1.0.tar.gz
Algorithm Hash digest
SHA256 1690471cfd3dd505635b51e7fbfa17f98c02f47e4a75ee81c02a73a3050a3a3f
MD5 e5f3fa0c3c412d916d33de82968877c7
BLAKE2b-256 e72743dd6a235a92eb4011043352b057d376b3466d0525606233402ea1f6aef2

See more details on using hashes here.

File details

Details for the file gemurl-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: gemurl-1.1.0-py3-none-any.whl
  • Upload date:
  • Size: 3.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.2

File hashes

Hashes for gemurl-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 85b1aa96fa8f7ae7b15b92970cafef7f88dc669e996d0b876ffc54059956feb5
MD5 cab1d25b0cd26177f5bc8a780ef4ebf8
BLAKE2b-256 0c31bb82069af02424414299ce79cc99a371d64dcae780d41cd5cfee2965bc16

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page