Check whether a URL is live, dead, or likely hallucinated, with Wayback Machine fallback
urlhealth
A Python library to check whether a URL is live, dead, or likely hallucinated. When a URL is dead, it automatically checks the Wayback Machine for an archived snapshot.
Installation
pip install urlhealth
Dependencies
Usage
from urlhealth import inspect, URLStatus
result = inspect("https://example.com")
print(result["url_status"]) # URLStatus.LIVE, .DEAD, .UNKNOWN, or .LIKELY_HALLUCINATED
print(result["status_code"]) # HTTP status code (int or None)
print(result["wayback_url"]) # Wayback Machine archive URL (str or None)
Return value
inspect(url, timeout=10) returns a dict with three keys:
| Key | Type | Description |
|---|---|---|
| `url_status` | `URLStatus` | `LIVE` (200), `DEAD` (404 with Wayback snapshot), `LIKELY_HALLUCINATED` (404 without snapshot), or `UNKNOWN` (other status / connection error) |
| `status_code` | `int` or `None` | HTTP status code, or `None` if the request failed |
| `wayback_url` | `str` or `None` | Archive URL when status is `DEAD`, otherwise `None` |
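As a rough illustration of consuming this dict, the sketch below turns a result into a one-line message. The helper name `summarize` and the sample archive URL are hypothetical and not part of urlhealth; the code relies only on the three keys documented above and on `url_status` comparing equal to plain strings.

```python
def summarize(result: dict) -> str:
    """Turn an inspect()-style result dict into a one-line message (hypothetical helper)."""
    status = result["url_status"]
    code = result["status_code"]
    if status == "LIVE":
        return f"live (HTTP {code})"
    if status == "DEAD":
        return f"dead, archived at {result['wayback_url']}"
    if status == "LIKELY_HALLUCINATED":
        return "not found and never archived"
    # UNKNOWN: some other status code, or no code at all because the request failed.
    return "unknown" if code is None else f"unknown (HTTP {code})"


# Hand-built dict in the documented shape, so the example runs without network access.
sample = {
    "url_status": "DEAD",
    "status_code": 404,
    "wayback_url": "https://web.archive.org/web/20200101000000/https://example.com/old",
}
print(summarize(sample))
```

In real use, the dict would come from `inspect(url)` instead of being built by hand.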
URLStatus enum
class URLStatus(str, Enum):
    LIVE = "LIVE"
    DEAD = "DEAD"
    UNKNOWN = "UNKNOWN"
    LIKELY_HALLUCINATED = "LIKELY_HALLUCINATED"
Since URLStatus is a str enum, you can compare directly with strings:
if result["url_status"] == "LIVE":
    print("URL is reachable")
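The snippet below is a small sketch of what the `str` mixin allows in practice. It redefines the enum locally, copied from the definition above, so that it runs without urlhealth installed. With the library installed, you would import `URLStatus` from `urlhealth` instead.

```python
import json
from enum import Enum


class URLStatus(str, Enum):
    # Local copy of the enum shown above, for illustration only.
    LIVE = "LIVE"
    DEAD = "DEAD"
    UNKNOWN = "UNKNOWN"
    LIKELY_HALLUCINATED = "LIKELY_HALLUCINATED"


status = URLStatus.DEAD

# Members compare equal to their string values.
print(status == "DEAD")

# A stored string can be converted back to the enum member by value.
print(URLStatus("LIVE") is URLStatus.LIVE)

# Because members are str instances, json.dumps writes them as plain strings.
print(json.dumps({"url_status": status}))
```

This makes results easy to store as JSON and to compare against values read back from a file or database.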
How it works
- Sends an HTTP `HEAD` request to the URL (falls back to `GET` if the server returns 405, 403, or 501).
- If the response is 200, the URL is `LIVE`.
- If the response is 404, queries the Wayback Machine API:
  - Archived snapshot found → `DEAD` (with `wayback_url` populated).
  - No snapshot → `LIKELY_HALLUCINATED`.
- Any other status code or connection error → `UNKNOWN`.
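The decision rules above can be sketched as a pure function. This is not the library's actual implementation: the names `classify`, `should_retry_with_get`, and `RETRY_WITH_GET` are hypothetical, the network calls are left out, and plain strings stand in for `URLStatus` members so the example has no dependencies.

```python
from typing import Optional

# Status codes after which, per the list above, the HEAD request is retried as GET.
RETRY_WITH_GET = {403, 405, 501}


def should_retry_with_get(head_status: int) -> bool:
    """Return True if a HEAD response with this status should be retried as GET."""
    return head_status in RETRY_WITH_GET


def classify(status_code: Optional[int], wayback_url: Optional[str]) -> dict:
    """Map a final status code and an optional snapshot URL to a result dict.

    status_code is None when the request failed (connection error).
    wayback_url is the archived snapshot URL, or None if no snapshot was found.
    """
    if status_code == 200:
        status = "LIVE"
    elif status_code == 404:
        status = "DEAD" if wayback_url else "LIKELY_HALLUCINATED"
    else:
        # Any other status code, or no response at all.
        status = "UNKNOWN"
    return {
        "url_status": status,
        "status_code": status_code,
        # The archive URL is only reported for DEAD results.
        "wayback_url": wayback_url if status == "DEAD" else None,
    }


print(classify(404, None)["url_status"])
print(classify(None, None)["url_status"])
```

Keeping the classification separate from the HTTP calls in this way makes the rules easy to test without network access.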
License
See LICENSE for details.