A simple tool for producing navigable, highlighted diffs of rendered HTML websites
Website Diff
website_diff is a utility that compares two HTML websites and outputs the diff as a third HTML website.
The diff site has insertion/deletion highlighting, automatic scroll-to-next and scroll-to-previous key bindings, image diffing, and highlighting of links pointing to diffed pages.
Why would I use website_diff?
This tool is primarily meant to help find differences in websites that are automatically generated from source documents (e.g. documentation pages or Jupyter Books), where those differences may not be obvious from the source diffs produced by GitHub. It is particularly useful when the source documents run code whose output may silently change even though the source files remain constant.
Installation
Ensure Rust and Cargo are installed. Installation instructions are available on the official Rust website.
You may also need to install the cairo library if your distribution does not include it.
pip install --upgrade pip
pip install website_diff
Command Line Usage
website_diff takes as input two folders, each containing an index.html file, as well as the name of a third folder to be created that will contain the diffed website.
website_diff --old path/to/old/site/ --new path/to/new/site/ --diff path/where/diff/site/will/be/created
If website_diff runs successfully, the diff website will be available at
path/where/diff/site/will/be/created/index.html
To access the command line interface help documentation, run
website_diff --help
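For scripted use, the documented command can be wrapped from Python. The following is a minimal sketch, not part of website_diff itself: the helper names build_command and run_website_diff are hypothetical, and only the --old, --new, and --diff flags shown above are assumed.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(old: str, new: str, diff: str) -> list[str]:
    """Assemble the argument list using the documented --old/--new/--diff flags."""
    return ["website_diff", "--old", old, "--new", new, "--diff", diff]


def run_website_diff(old: str, new: str, diff: str) -> Path:
    """Run website_diff (hypothetical wrapper) and return the diff site's index.html path."""
    # Each input folder is expected to contain an index.html file.
    for folder in (old, new):
        if not (Path(folder) / "index.html").is_file():
            raise FileNotFoundError(f"{folder} does not contain index.html")
    # Fail early with a readable message if the CLI is not installed.
    if shutil.which("website_diff") is None:
        raise RuntimeError("website_diff is not on PATH; install it with pip first")
    subprocess.run(build_command(old, new, diff), check=True)
    return Path(diff) / "index.html"
```

Calling run_website_diff("path/to/old/site/", "path/to/new/site/", "path/where/diff/site/will/be/created") would then mirror the command above and return the location of the generated index.html.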
GitHub Actions Usage
website_diff can be used as part of a GitHub Actions workflow that runs when a website source code repository is updated.
See the example workflow in .github/workflow-templates/example_workflow.yml. In this example, the workflow:
- Checks out the website repository on the PR branch
- Builds the PR website
- Commits the PR website to the gh-pages branch
- Checks out the gh-pages branch
- Builds the diff website using website_diff
- Commits the diff website to the gh-pages branch
- Posts a message on the PR thread pointing users to each website
This example template is based on a real usage of website_diff in the Introduction to Data Science online textbook repository: https://github.com/UBC-DSCI/introduction-to-datascience-python/blob/main/.github/workflows/deploy_pr_preview.yml
Visual Diff Style
- Text: Diffs are highlighted in green if text was inserted, and red if text was deleted.
- Links to pages with diffs: Any links that point to a page containing diffs are yellow.
- Images: New images have a green border and are highlighted in green, deleted images have a red border and are highlighted in red, and changed images are outlined in yellow with differences highlighted in red.
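To illustrate the text highlighting convention, here is a minimal sketch using Python's standard difflib module. It is not how website_diff is implemented; the function name highlight_diff, the word-level granularity, and the colour values are assumptions chosen for illustration.

```python
import difflib
import html


def highlight_diff(old_text: str, new_text: str) -> str:
    """Return HTML with deleted words marked red and inserted words marked green."""
    old_words, new_words = old_text.split(), new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            # Words present only in the old text: red background.
            removed = html.escape(" ".join(old_words[i1:i2]))
            parts.append(f'<del style="background: #fbb">{removed}</del>')
        if tag in ("insert", "replace"):
            # Words present only in the new text: green background.
            added = html.escape(" ".join(new_words[j1:j2]))
            parts.append(f'<ins style="background: #bfb">{added}</ins>')
        if tag == "equal":
            # Unchanged words pass through without markup.
            parts.append(html.escape(" ".join(old_words[i1:i2])))
    return " ".join(parts)
```

For example, highlight_diff("the cat sat", "the dog sat") wraps "cat" in a red del element and "dog" in a green ins element, leaving "the" and "sat" unmarked.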
Keyboard Controls
- When first opening a page with diffs, the browser will scroll to the first diff on the page
- To scroll to the next off-page diff, press the n key
- To scroll to the previous off-page diff, press Shift+n or N
Examples
There are several examples that demonstrate the kinds of differences that website_diff will detect. To run website_diff on those examples, run the bash script run_tests.sh found within the website_diff repo. The run_tests.sh script pulls the examples from a separate repo called website_diff_examples. The folder website_diff_examples/examples will then contain several folders, each representing a different example (e.g. lines of text changed, image added, page added). In each of those folders, there will be an old and a prerendered_old folder for the old website and the old website with pre-rendered figures, a new and a prerendered_new folder for the new website and the new website with pre-rendered figures, and lastly a diff folder for the diffed version of the website, with an index.html file that shows everything that has changed between the old and new versions of the website.
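Once run_tests.sh has populated the examples folder, a quick check of the layout described above might look like the following sketch. The helper summarize_examples is hypothetical and not part of the repo; it only assumes the subfolder names listed above.

```python
from pathlib import Path

# Subfolder names taken from the description of each example folder.
EXPECTED_SUBFOLDERS = ("old", "prerendered_old", "new", "prerendered_new", "diff")


def summarize_examples(examples_root: str) -> dict[str, list[str]]:
    """Map each example folder name to the expected entries it is missing."""
    summary = {}
    for example in sorted(Path(examples_root).iterdir()):
        if not example.is_dir():
            continue
        missing = [name for name in EXPECTED_SUBFOLDERS if not (example / name).is_dir()]
        # The diff folder should also hold the entry page of the diffed site.
        if "diff" not in missing and not (example / "diff" / "index.html").is_file():
            missing.append("diff/index.html")
        summary[example.name] = missing
    return summary
```

Running summarize_examples("website_diff_examples/examples") returns an empty list for every example whose folders were all created, which makes an incomplete test run easy to spot.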