Parsing and validation of URIs (RFC 3986) and IRIs (RFC 3987)

Project description

This module provides regular expressions according to RFC 3986 “Uniform Resource Identifier (URI): Generic Syntax” and RFC 3987 “Internationalized Resource Identifiers (IRIs)”, and utilities for composition and relative resolution of references:

patterns

A dict of regular expressions keyed by rule names for URIs and rule names for IRIs.

>>> u = regex.compile('^%s$' % patterns['URI'])
>>> m = u.match(u'http://tools.ietf.org/html/rfc3986#appendix-A')
>>> assert m.groupdict() == dict(scheme=u'http',
...                              authority=u'tools.ietf.org',
...                              userinfo=None, host=u'tools.ietf.org',
...                              port=None, path=u'/html/rfc3986',
...                              query=None, fragment=u'appendix-A')
>>> assert not u.match(u'urn:\U00010300')
>>> assert regex.match('^%s$' % patterns['IRI'], u'urn:\U00010300')
>>> assert not regex.match('^%s$' % patterns['relative_ref'], '#f#g')
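
For comparison, here is a minimal sketch of the non-validating splitter given in RFC 3986, Appendix B, written with the standard `re` module. The names `SPLIT_RE` and `split_uri` are hypothetical and not part of this module. Unlike `patterns['URI']`, this expression only separates components and accepts strings that are not valid URIs.

```python
import re

# Appendix B expression from RFC 3986, with named groups added for readability.
# It splits a reference into components but does NOT validate it.
SPLIT_RE = re.compile(
    r'^(?:(?P<scheme>[^:/?#]+):)?'
    r'(?://(?P<authority>[^/?#]*))?'
    r'(?P<path>[^?#]*)'
    r'(?:\?(?P<query>[^#]*))?'
    r'(?:#(?P<fragment>.*))?$',
    flags=re.DOTALL,
)


def split_uri(uri):
    """Return a dict of the five generic components (None when absent)."""
    return SPLIT_RE.match(uri).groupdict()


parts = split_uri('http://tools.ietf.org/html/rfc3986#appendix-A')
```

The splitter happily accepts `'#f#g'` and reports the fragment `'f#g'`, which is why the stricter grammar-based patterns are needed for validation.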

compose

Returns a URI composed from named parts.
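
Composition from parts can be sketched along the lines of the recomposition algorithm in RFC 3986, section 5.3. The function `compose_uri` below is a hypothetical stand-in used for illustration only; it is not this module's `compose` and makes no claim about that function's signature or validation behaviour.

```python
def compose_uri(scheme=None, authority=None, path='', query=None, fragment=None):
    """Recompose a URI reference from its parts (RFC 3986, section 5.3).

    A part that is None is treated as undefined and omitted together with
    its delimiter; an empty string is kept, so query='' still yields '?'.
    """
    result = ''
    if scheme is not None:
        result += scheme + ':'
    if authority is not None:
        result += '//' + authority
    result += path  # the path is always present, though it may be empty
    if query is not None:
        result += '?' + query
    if fragment is not None:
        result += '#' + fragment
    return result


uri = compose_uri(scheme='http', authority='tools.ietf.org',
                  path='/html/rfc3986', fragment='appendix-A')
```

Distinguishing `None` from the empty string matters here: RFC 3986 treats an undefined component differently from a defined but empty one.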

resolve

Resolves a URI reference relative to a base URI.

Test cases:

>>> base = "http://a/b/c/d;p?q"
>>> for relative, resolved in {
...     "g:h"           :  "g:h",
...     "g"             :  "http://a/b/c/g",
...     "./g"           :  "http://a/b/c/g",
...     "g/"            :  "http://a/b/c/g/",
...     "/g"            :  "http://a/g",
...     "//g"           :  "http://g",
...     "?y"            :  "http://a/b/c/d;p?y",
...     "g?y"           :  "http://a/b/c/g?y",
...     "#s"            :  "http://a/b/c/d;p?q#s",
...     "g#s"           :  "http://a/b/c/g#s",
...     "g?y#s"         :  "http://a/b/c/g?y#s",
...     ";x"            :  "http://a/b/c/;x",
...     "g;x"           :  "http://a/b/c/g;x",
...     "g;x?y#s"       :  "http://a/b/c/g;x?y#s",
...     ""              :  "http://a/b/c/d;p?q",
...     "."             :  "http://a/b/c/",
...     "./"            :  "http://a/b/c/",
...     ".."            :  "http://a/b/",
...     "../"           :  "http://a/b/",
...     "../g"          :  "http://a/b/g",
...     "../.."         :  "http://a/",
...     "../../"        :  "http://a/",
...     "../../g"       :  "http://a/g",
...     "../../../g"    :  "http://a/g",
...     "../../../../g" :  "http://a/g",
...     "/./g"          :  "http://a/g",
...     "/../g"         :  "http://a/g",
...     "g."            :  "http://a/b/c/g.",
...     ".g"            :  "http://a/b/c/.g",
...     "g.."           :  "http://a/b/c/g..",
...     "..g"           :  "http://a/b/c/..g",
...     "./../g"        :  "http://a/b/g",
...     "./g/."         :  "http://a/b/c/g/",
...     "g/./h"         :  "http://a/b/c/g/h",
...     "g/../h"        :  "http://a/b/c/h",
...     "g;x=1/./y"     :  "http://a/b/c/g;x=1/y",
...     "g;x=1/../y"    :  "http://a/b/c/y",
...     }.items():
...     assert resolve(base, relative) == resolved

If return_parts is True, returns a dict of named parts instead of a string.
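
Many of the test cases above depend on how "." and ".." segments are collapsed after the reference path has been merged with the base path. A minimal sketch of that step, following RFC 3986, section 5.2.4, might look as follows. The function name `remove_dot_segments` is taken from the RFC; the code is an illustration and not necessarily how this module implements `resolve`.

```python
def remove_dot_segments(path):
    """Collapse '.' and '..' segments as described in RFC 3986, section 5.2.4."""
    output = []
    while path:
        if path.startswith('../'):
            path = path[3:]
        elif path.startswith('./'):
            path = path[2:]
        elif path.startswith('/./'):
            path = path[2:]          # '/./x' becomes '/x'
        elif path == '/.':
            path = '/'
        elif path.startswith('/../'):
            path = path[3:]          # '/../x' becomes '/x'
            if output:
                output.pop()         # and the previous segment is dropped
        elif path == '/..':
            path = '/'
            if output:
                output.pop()
        elif path in ('.', '..'):
            path = ''
        else:
            # Move the first segment (with its leading '/', if any) to output.
            start = 1 if path.startswith('/') else 0
            end = path.find('/', start)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return ''.join(output)
```

For example, resolving "../../../g" against the base above first merges the paths into "/b/c/../../../g"; the surplus ".." segments have nothing left to remove, so the result is "/g".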

Files for rfc3987, version 1.2.1
Filename, size: rfc3987-1.2.1.tar.gz (5.3 kB)
File type: Source
Python version: None