pyHtmlProofer - A tool for validating internal & external links in HTML files / Websites
pyhtmlproofer
Check websites and static HTML pages for link rot.
Features
pyhtmlproofer can be used on
- Static HTML pages (typically generated by an SSG). You can specify either files or directories to be checked.
- Webpages. You can specify a URL/link to be checked.
pyhtmlproofer at the moment does the following:
- Checks for broken internal links in HTML files
- Checks whether external links in HTML files or on a website are valid
- Checks script and stylesheet references in HTML files
- Checks image references in HTML files
You can read more details below in the What's tested? section.
Roadmap
The following features are under development:
- Check for images and alt-text in HTML files
- Check Favicons
- Check optimal SEO meta tags
- Caching results
- Config file
Installation
Install pyhtmlproofer with pip:
pip install pyhtmlproofer
What's tested?
You can configure pyhtmlproofer to check:
- a file
- a directory or list of directories
- a URL / Link
Links / Hyperlinks
a, link elements: pyhtmlproofer checks
- If the internal links are valid
- If the internal references (#in-page-links) are valid
- If the external links are valid
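To make the in-page reference check concrete, here is a minimal sketch using only the Python standard library. It is an illustration of the idea, not pyhtmlproofer's actual implementation; the names `AnchorCollector` and `broken_fragments` are hypothetical.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect anchor targets and in-page link references from one document."""

    def __init__(self):
        super().__init__()
        self.ids = set()       # values of id attributes and <a name="...">
        self.fragments = []    # targets of <a href="#..."> links

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.ids.add(attrs["id"])
        if tag == "a" and attrs.get("name"):
            self.ids.add(attrs["name"])
        href = attrs.get("href")
        if tag == "a" and href and href.startswith("#") and len(href) > 1:
            self.fragments.append(href[1:])


def broken_fragments(html):
    """Return in-page link targets that match no anchor in the document."""
    collector = AnchorCollector()
    collector.feed(html)
    return [name for name in collector.fragments if name not in collector.ids]


sample = '<h1 id="top">Title</h1><a href="#top">ok</a><a href="#missing">bad</a>'
print(broken_fragments(sample))
```

The whole document is parsed before comparing, so a link that points to an anchor defined later on the page is not reported as broken.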
Images
img elements: pyhtmlproofer checks
- If the internal image references are valid
- If the external image references are valid
Scripts
script elements: pyhtmlproofer checks
- If the internal script references are valid
- If the external script references are reachable
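The checks above all start by separating internal references from external ones. The sketch below shows one way to do that with the standard library; it is an assumption about the general approach rather than pyhtmlproofer's own code, and `ReferenceCollector` and `REFERENCE_ATTRS` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Attribute that carries the reference for each element type listed above.
REFERENCE_ATTRS = {"a": "href", "link": "href", "img": "src", "script": "src"}


class ReferenceCollector(HTMLParser):
    """Split the references found in a document into internal and external."""

    def __init__(self):
        super().__init__()
        self.internal = []  # (tag, reference) pairs that point into the site
        self.external = []  # (tag, reference) pairs that point to another host

    def handle_starttag(self, tag, attrs):
        attr_name = REFERENCE_ATTRS.get(tag)
        if attr_name is None:
            return
        value = dict(attrs).get(attr_name)
        if not value:
            return
        parsed = urlparse(value)
        if parsed.scheme in ("http", "https") or parsed.netloc:
            # Absolute URLs and protocol-relative URLs (//host/path).
            self.external.append((tag, value))
        elif parsed.scheme:
            # mailto:, tel:, data: and similar are skipped in this sketch.
            return
        else:
            self.internal.append((tag, value))


collector = ReferenceCollector()
collector.feed('<a href="https://example.com">x</a><img src="img/logo.png">')
print(collector.external, collector.internal)
```

Internal references can then be resolved against the file system, while external ones need a network request to confirm they are reachable.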
Usage
a) To check a file:
import pyhtmlproofer as proofer
file = "path/to/file1.html"
proofer.file(file).check()
b) To check directories:
import pyhtmlproofer as proofer
directory_paths = ["path/to/1/", "path/to/2/"]
proofer.directories(directory_paths).check()
c) To validate URL(s):
import pyhtmlproofer as proofer
links = ["https://example.com", "https://cloudbytes.dev"]
proofer.links(links).check()
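When directories are passed, the HTML files inside them have to be discovered first. The following is a rough sketch of that step, assuming files are matched by extension and that some paths can be excluded (similar in spirit to the `extensions` and `ignore_files` options listed under Available Config Options). The helper `find_html_files` is hypothetical and is not part of pyhtmlproofer's API.

```python
from pathlib import Path


def find_html_files(directories, extensions=(".html",), ignore_files=()):
    """Recursively collect files with a matching extension, minus ignored ones."""
    ignored = {Path(p).resolve() for p in ignore_files}
    found = []
    for directory in directories:
        # sorted() keeps the order stable across runs and platforms.
        for path in sorted(Path(directory).rglob("*")):
            if not path.is_file():
                continue
            if path.suffix in extensions and path.resolve() not in ignored:
                found.append(path)
    return found
```

For example, `find_html_files(["path/to/1/", "path/to/2/"])` would return every `.html` file under both directories.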
CLI
There is also a CLI that can be used:
$ pyhtmlproofer check -F <file_name>
Available Config Options
PROOFER_DEFAULTS = {
"assume_extension": ".html",
"directory_index_file": "index.html",
"disable_external": False,
"ignore_files": [],
"ignore_urls": [],
"enforce_https": True,
"extensions": [".html"],
"log_level": "ERROR",
"report_to_file": True,
"report_filename": "proofer_report",
}
You can override the default configuration options by passing a dictionary of options.
import pyhtmlproofer as proofer
options = {"log_level": "ERROR", "disable_external": True}
directory_paths = ["path/to/1/", "path/to/2/"]
proofer.directories(directory_paths, options=options).check()
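Overriding defaults with a dictionary is commonly implemented as a shallow merge, where keys in the user's dictionary replace the matching defaults and all other defaults are kept. The sketch below illustrates that behaviour under this assumption; `merge_options` is a hypothetical helper and the rejection of unknown keys is a design choice of the sketch, not documented pyhtmlproofer behaviour.

```python
PROOFER_DEFAULTS = {
    "assume_extension": ".html",
    "directory_index_file": "index.html",
    "disable_external": False,
    "ignore_files": [],
    "ignore_urls": [],
    "enforce_https": True,
    "extensions": [".html"],
    "log_level": "ERROR",
    "report_to_file": True,
    "report_filename": "proofer_report",
}


def merge_options(overrides=None):
    """Return a new dict: the defaults, updated with the user's overrides."""
    overrides = overrides or {}
    unknown = set(overrides) - set(PROOFER_DEFAULTS)
    if unknown:
        # Catch typos such as "disable_externals" early.
        raise KeyError(f"Unknown option(s): {sorted(unknown)}")
    # Later keys win, and PROOFER_DEFAULTS itself is left unmodified.
    return {**PROOFER_DEFAULTS, **overrides}


options = merge_options({"disable_external": True})
print(options["disable_external"], options["enforce_https"])
```

Only the keys you pass change; every other option keeps its default value.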
Credits
pyhtmlproofer was inspired by the Ruby-based HTMLProofer and by the lack of Python-based alternatives. It is not a Python rewrite of HTMLProofer, however; it focuses on solving problems that I encountered while maintaining the CloudBytes/Dev> website.