pyHtmlProofer - A tool for validating internal & external links in HTML files / Websites

pyHTMLProofer

Check websites and static HTML pages for link rot.

Features

pyHTMLProofer can be used on

  1. Static HTML pages (typically generated by an SSG). You can specify either files or directories to be checked.
  2. Web pages. You can specify a URL to be checked.

pyHTMLProofer at the moment does the following:

  1. Checks for broken internal links in HTML files
  2. Checks whether external links in HTML files or on web pages are valid

You can read more details in the What's tested? section below.
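
The split between internal and external links can be illustrated with a small standalone sketch. This is not pyHTMLProofer's own code; `classify_link` is a hypothetical helper, and the rule it applies (an http/https scheme or a host means "external") is an assumption about how such a check is typically done.

```python
from urllib.parse import urlparse


def classify_link(href: str) -> str:
    """Classify an href as 'anchor', 'external', 'internal', or 'other'."""
    # In-page references such as "#section" never leave the current page.
    if href.startswith("#"):
        return "anchor"
    parsed = urlparse(href)
    # http(s) URLs and protocol-relative URLs ("//host/path") point off-site.
    if parsed.scheme in ("http", "https") or parsed.netloc:
        return "external"
    # Any other scheme (mailto:, tel:, ...) is not a page link at all.
    if parsed.scheme:
        return "other"
    # What remains is a relative or root-relative path within the site.
    return "internal"
```

An internal link would then be resolved against the local files, while an external link would need an HTTP request to confirm it still responds.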

Roadmap

The following features are under development:

  1. Check for scripts / stylesheets in HTML files
  2. Check for images and alt-text in HTML files
  3. Check entire website using Sitemap

Installation

Install pyHTMLProofer with pip:

pip install pyhtmlproofer

What's tested?

You can configure pyHTMLProofer to check:

  • a file
  • a directory or list of directories
  • a URL / Link

For a elements, pyHTMLProofer checks:

  • If the internal links are valid
  • If the internal references (#in-page-links) are valid
  • If the external links are valid
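
To make the in-page reference check concrete, here is a minimal sketch built on the standard library's `html.parser`. It is an illustration of the idea, not pyHTMLProofer's implementation; `AnchorCollector` and `broken_fragments` are hypothetical names, and treating both `id` attributes and legacy `<a name="...">` as valid targets is an assumption.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect fragment targets and the in-page links that point at them."""

    def __init__(self):
        super().__init__()
        self.targets = set()   # values of id="..." (and <a name="...">)
        self.fragments = []    # "#fragment" hrefs, without the leading "#"

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if attributes.get("id"):
            self.targets.add(attributes["id"])
        if tag == "a":
            if attributes.get("name"):
                self.targets.add(attributes["name"])
            href = attributes.get("href") or ""
            if href.startswith("#") and len(href) > 1:
                self.fragments.append(href[1:])


def broken_fragments(html: str) -> list[str]:
    """Return the in-page fragments that have no matching target."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return [f for f in collector.fragments if f not in collector.targets]
```

Because targets are collected over the whole document before comparison, a link may appear before the element it points to.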

Usage

To check a file:

import pyHtmlProofer
file = "path/to/file1.html"
pyHtmlProofer.file(file).check()

To check directories:

import pyHtmlProofer
directory_paths = ["path/to/1/", "path/to/2/"]
pyHtmlProofer.directories(directory_paths).check()
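
As a rough picture of what checking directories involves, the sketch below gathers the HTML files under a list of directories. It is not taken from pyHTMLProofer. `collect_html_files` is a hypothetical helper, and both the recursive search and the way it interprets the `extensions` and `ignore_files` options (listed under Available Config Options) are assumptions.

```python
from pathlib import Path


def collect_html_files(directories, extensions=(".html",), ignore_files=()):
    """Return every file under `directories` whose suffix is in `extensions`."""
    # Resolve ignored paths once so comparisons are not fooled by "./" or symlinks.
    ignored = {Path(p).resolve() for p in ignore_files}
    found = []
    for directory in directories:
        # rglob("*") walks the directory tree recursively; sorting keeps output stable.
        for path in sorted(Path(directory).rglob("*")):
            if not path.is_file() or path.suffix not in extensions:
                continue
            if path.resolve() in ignored:
                continue
            found.append(path)
    return found
```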

To validate URL(s):

import pyHtmlProofer
links = ["https://example.com", "https://cloudbytes.dev"]
pyHtmlProofer.links(links).check()

Available Config Options

PROOFER_DEFAULTS = {
    "assume_extension": ".html",
    "directory_index_file": "index.html",
    "disable_external": False,
    "ignore_files": [],
    "ignore_urls": [],
    "enforce_https": True,
    "extensions": [".html"],
    "log_level": "INFO",
}

You can override the default configuration options by passing a dictionary of options.

import pyHtmlProofer

options = {"log_level": "ERROR", "disable_external": True}
directory_paths = ["path/to/1/", "path/to/2/"]

pyHtmlProofer.directories(directory_paths, options=options).check()
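
Overriding defaults with a dictionary usually comes down to a dictionary merge. The sketch below shows one way that could work. `merge_options` is a hypothetical helper rather than part of pyHTMLProofer, and rejecting unknown keys is an assumption; the library may handle them differently.

```python
PROOFER_DEFAULTS = {
    "assume_extension": ".html",
    "directory_index_file": "index.html",
    "disable_external": False,
    "ignore_files": [],
    "ignore_urls": [],
    "enforce_https": True,
    "extensions": [".html"],
    "log_level": "INFO",
}


def merge_options(overrides=None):
    """Return the defaults with user overrides applied, leaving the defaults untouched."""
    overrides = overrides or {}
    # Assumption: a misspelled option name should fail loudly rather than be ignored.
    unknown = set(overrides) - set(PROOFER_DEFAULTS)
    if unknown:
        raise KeyError(f"Unknown option(s): {sorted(unknown)}")
    # Later keys win, so values from `overrides` replace the defaults.
    return {**PROOFER_DEFAULTS, **overrides}
```

Only the keys you pass are changed; every other option keeps its default value.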

Credits

pyHTMLProofer was inspired by the Ruby-based HTMLProofer and the lack of Python-based alternatives. It is not a Python rewrite of HTMLProofer, however; it focuses on solving problems that I encountered while maintaining the CloudBytes/Dev> website.
