
Library to find URLs and check their validity.


urlfinderlib

Python library for finding URLs in documents and arbitrary data and checking their validity.

Basic usage

from urlfinderlib import find_urls

# Read the file as bytes and print every URL found in it.
with open('/path/to/file', 'rb') as f:
    print(find_urls(f.read()))

base_url usage

If you are looking for URLs inside an HTML file, many of its links are likely relative paths, resolved against the file's location on the server that hosts it. In this case, pass the base_url parameter so that these "relative" URLs are extracted as well.

from urlfinderlib import find_urls

# base_url is the address the HTML was served from; relative links are resolved against it.
with open('/path/to/file', 'rb') as f:
    print(find_urls(f.read(), base_url='http://somewebsite.com/'))
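To make "relative to a base URL" concrete, here is a minimal sketch using only the standard library's urllib.parse.urljoin. It does not show how urlfinderlib resolves links internally; resolve_links is a hypothetical helper and the link values are made up for illustration.

```python
from urllib.parse import urljoin


def resolve_links(base_url, links):
    """Resolve each link against base_url (hypothetical helper, not part of urlfinderlib)."""
    return [urljoin(base_url, link) for link in links]


# Example links as they might appear in href attributes (illustrative values only).
links = [
    "docs/page.html",             # relative to the base path
    "/images/logo.png",           # relative to the site root
    "http://other.example/page",  # already absolute, left unchanged
]

for url in resolve_links("http://somewebsite.com/", links):
    print(url)
```

Without a base URL, the first two links are bare paths with no host, which is presumably why a URL finder needs base_url in order to report them as complete URLs.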


Download files

Download the file for your platform.

Source Distribution

urlfinderlib-0.11.11.tar.gz (10.8 kB)

Uploaded Source

Built Distribution

urlfinderlib-0.11.11-py3-none-any.whl (14.4 kB)

Uploaded Python 3

File details

Details for the file urlfinderlib-0.11.11.tar.gz.

File metadata

  • Download URL: urlfinderlib-0.11.11.tar.gz
  • Upload date:
  • Size: 10.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.9

File hashes

Hashes for urlfinderlib-0.11.11.tar.gz

  • SHA256: 4a06f3fae39213f1a1753ed3949784cf790d1120d5a1d74ea19969ecbc50b308
  • MD5: 5d7f5ab950d4a8551257c2602acac052
  • BLAKE2b-256: 9dc9dc1da32ddc6d8ba652b0cf543db7dd12f347e59eb2caa32a3a736f67389c
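The digests listed above can be used to check that a downloaded file is intact. The following is a minimal sketch using the standard library's hashlib; sha256_of_file and matches_published_hash are hypothetical helper names, and the local file path is up to you.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, calling matches_published_hash on a local copy of urlfinderlib-0.11.11.tar.gz with the SHA256 value listed above should return True if the download is intact.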


File details

Details for the file urlfinderlib-0.11.11-py3-none-any.whl.

File metadata

  • Download URL: urlfinderlib-0.11.11-py3-none-any.whl
  • Upload date:
  • Size: 14.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.9

File hashes

Hashes for urlfinderlib-0.11.11-py3-none-any.whl

  • SHA256: eac6bf05aefc4c40bd565ced40cc38540f5ed4174d8716863d4c46e0bcf5ca99
  • MD5: 8d0dfbf96e2e9ab5fe512c35f408ccdf
  • BLAKE2b-256: 04e765af6843be60a9c00d2580cf755bf5acdc17f71f2e5a4e24e36b67098efb

