A self-contained, easily reusable library for parsing, manipulating,

The lazr.uri package includes code for parsing and dealing with URIs.

>>> import lazr.uri
>>> print('VERSION:', lazr.uri.__version__)
VERSION: ...

The URI class

>>> from lazr.uri import URI
>>> uri1 = URI('http://localhost/foo/bar?123')
>>> uri2 = URI('http://localhost/foo/bar/baz')
>>> uri1.contains(uri2)
True

These next two differ only by a trailing slash and are treated as equivalent, so the answer should be True, even though the “outside” one is longer than the “inside” one.

>>> uri1 = URI('http://localhost/foo/bar/')
>>> uri2 = URI('http://localhost/foo/bar')
>>> uri1.contains(uri2)
True

The next two are exactly the same. A URI is considered to be inside itself.

>>> uri1 = URI('http://localhost/foo/bar/')
>>> uri2 = URI('http://localhost/foo/bar/')
>>> uri1.contains(uri2)
True

In the next case, the string of uri2 starts with the string of uri1. However, because uri2 continues within the same path step, uri2 is not inside uri1.

>>> uri1 = URI('http://localhost/foo/ba')
>>> uri2 = URI('http://localhost/foo/bar')
>>> uri1.contains(uri2)
False

Here, uri2 is uri1 plus an extra path step, so uri2 is inside uri1.

>>> uri1 = URI('http://localhost/foo/bar/')
>>> uri2 = URI('http://localhost/foo/bar/baz')
>>> uri1.contains(uri2)
True
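
The containment rule shown in the examples above compares whole path steps rather than raw string prefixes. The following is a minimal standard-library sketch of that rule, not lazr.uri's implementation; `path_contains` is a hypothetical helper, and it skips the normalisation that the URI class performs.

```python
from urllib.parse import urlsplit


def path_contains(outer: str, inner: str) -> bool:
    """Return True if `inner` lies at or below `outer` (hypothetical helper)."""
    o, i = urlsplit(outer), urlsplit(inner)
    # Scheme and authority must match; query and fragment are ignored.
    if (o.scheme, o.netloc) != (i.scheme, i.netloc):
        return False
    # Terminate both paths with exactly one slash so that only complete
    # path steps can match: '/foo/ba/' is not a prefix of '/foo/bar/'.
    base = o.path.rstrip('/') + '/'
    target = i.path.rstrip('/') + '/'
    return target.startswith(base)


print(path_contains('http://localhost/foo/ba', 'http://localhost/foo/bar'))
print(path_contains('http://localhost/foo/bar/', 'http://localhost/foo/bar/baz'))
```

Appending the slash before comparing is what lets a plain prefix test respect path-step boundaries.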

Once the URI is parsed, its parts are accessible.

>>> uri = URI('https://fish.tree:8666/blee/blah')
>>> uri.scheme
'https'
>>> uri.host
'fish.tree'
>>> uri.port
'8666'
>>> uri.authority
'fish.tree:8666'
>>> uri.path
'/blee/blah'
>>> uri = URI('https://localhost/blee/blah')
>>> uri.scheme
'https'
>>> uri.host
'localhost'
>>> uri.port is None
True
>>> uri.authority
'localhost'
>>> uri.path
'/blee/blah'
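
For readers curious how such a split can be done, here is a rough sketch based on the component regular expression from RFC 3986, Appendix B. It is an illustration only and makes no claim about how the URI class works internally; `split_uri` and its returned dictionary are hypothetical, and the host/port handling is deliberately simplified.

```python
import re

# Component-splitting expression from RFC 3986, Appendix B.
URI_RE = re.compile(
    r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')


def split_uri(text: str) -> dict:
    """Split a URI string into its components (hypothetical helper)."""
    match = URI_RE.match(text)  # the pattern matches any string
    scheme, authority, path, query, fragment = match.group(2, 4, 5, 7, 9)
    host, port = authority, None
    if authority is not None:
        # Simplification: drop any userinfo, then treat a trailing
        # ':digits' as the port. The port stays a string.
        hostport = authority.rpartition('@')[2]
        head, sep, tail = hostport.rpartition(':')
        if sep and tail.isdigit():
            host, port = head, tail
        else:
            host = hostport
    return {'scheme': scheme, 'authority': authority, 'host': host,
            'port': port, 'path': path, 'query': query,
            'fragment': fragment}


parts = split_uri('https://fish.tree:8666/blee/blah')
print(parts['host'], parts['port'], parts['path'])
```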

The grammar from RFC 3986 does not allow for square brackets in the query component, but Section 3.4 does say how such delimiter characters should be handled if found in the component.

>>> uri = URI('http://www.apple.com/store?delivery=[slow]#horse+cart')
>>> uri.scheme
'http'
>>> uri.host
'www.apple.com'
>>> uri.port is None
True
>>> uri.path
'/store'
>>> uri.query
'delivery=[slow]'
>>> uri.fragment
'horse+cart'
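
As a point of comparison only (this is the standard library, not lazr.uri), `urllib.parse.urlsplit` likewise leaves literal square brackets in the query untouched, and `urllib.parse.quote` shows the percent-encoded form that a strict producer could emit instead. The variable names are illustrative.

```python
from urllib.parse import quote, urlsplit

parts = urlsplit('http://www.apple.com/store?delivery=[slow]#horse+cart')

# The brackets survive splitting as-is.
raw_query = parts.query

# Percent-encode everything except the '=' and '&' query delimiters.
strict_query = quote(raw_query, safe='=&')

print(raw_query)
print(strict_query)
```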

Finding URIs in Text

lazr.uri also knows how to retrieve a list of URIs from a block of text. This is intended for uses like finding bug tracker URIs or similar.

The find_uris_in_text() function returns an iterator that yields URI objects for each URI found in the text. Note that the returned URIs have been canonicalised by the URI class:

>>> from lazr.uri import find_uris_in_text
>>> text = '''
... A list of URIs:
...  * http://localhost/a/b
...  * http://launchpad.net
...  * MAILTO:joe@example.com
...  * xmpp:fred@example.org
...  * http://bazaar.launchpad.net/%7ename12/firefox/foo
...  * http://somewhere.in/time?track=[02]#wasted-years
... '''
>>> for uri in find_uris_in_text(text):
...     print(uri)
http://localhost/a/b
http://launchpad.net/
mailto:joe@example.com
xmpp:fred@example.org
http://bazaar.launchpad.net/~name12/firefox/foo
http://somewhere.in/time?track=[02]#wasted-years
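
To make the idea concrete, the sketch below approximates the same scan-and-normalise behaviour with the standard library alone. It is an assumption-laden illustration, not the code behind find_uris_in_text(): the scheme list, the loose pattern, and the helpers `decode_unreserved`, `normalise` and `find_uris_sketch` are all hypothetical, and only a few normalisation steps are covered.

```python
import re

# Deliberately loose, hypothetical pattern: a known scheme, a colon,
# then everything up to whitespace or an obvious delimiter.
URI_PATTERN = re.compile(r'\b(?:https?|mailto|xmpp):[^\s<>"]+', re.IGNORECASE)

UNRESERVED = set('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
                 '0123456789-._~')


def decode_unreserved(text):
    """Decode percent-escapes of unreserved characters, e.g. %7e to ~."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else match.group(0).upper()
    return re.sub(r'%([0-9A-Fa-f]{2})', repl, text)


def normalise(raw):
    """Lower-case scheme and authority; give an empty path a '/'."""
    scheme, _, rest = raw.partition(':')
    rest = decode_unreserved(rest)
    if rest.startswith('//'):
        authority = re.match(r'[^/?#]*', rest[2:]).group(0)
        tail = rest[2 + len(authority):]
        if not tail.startswith('/'):
            tail = '/' + tail
        # Simplification: this lower-cases any userinfo as well.
        rest = '//' + authority.lower() + tail
    return scheme.lower() + ':' + rest


def find_uris_sketch(text):
    """Yield a normalised string for each URI-like match in the text."""
    for match in URI_PATTERN.finditer(text):
        # Trim punctuation that usually belongs to the sentence.
        yield normalise(match.group(0).rstrip('.,;'))


sample = '''
A list of URIs:
 * http://launchpad.net
 * MAILTO:joe@example.com
 * http://bazaar.launchpad.net/%7ename12/firefox/foo
'''
for uri in find_uris_sketch(sample):
    print(uri)
```

Unlike the library function, this sketch yields plain strings rather than URI objects.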

NEWS for lazr.uri

1.0.6 (2021-09-13)

  • Adjust versioning strategy to avoid importing pkg_resources, which is slow in large environments.

1.0.5 (2020-06-29)

  • Add an explicit __hash__ method to lazr.uri.URI.

1.0.4 (2020-06-12)

1.0.3 (2012-01-18)

  • Add compatibility with Python 3 (Thomas Kluyver).

1.0.1 (2009-06-01)

  • Eliminate dependency on setuptools_bzr so sdists do not bring bzr in, among other things.

1.0 (2009-03-23)

  • Initial release on PyPI
