Find your broken links, so users don't.

PyAnchor

Dead links are an annoyance for websites with an extensive amount of content. Aside from the negative impact on SEO, they frustrate any user who clicks on one.

PyAnchor is primarily for checking the HTTP response of every link on a page. You can integrate it into your development workflow so that users never see a 404 in the first place.

Install

Requires Python 3.8 or above.

It is recommended that you install this package in a virtual or isolated environment. The easiest way to do this is with pipx.

pipx install pyanchor

Alternatively, you can install it with pip into your virtual environment:

macOS / Linux:

python3 -m pip install pyanchor

Windows:

python -m pip install pyanchor

Using the CLI

The CLI can be invoked with the pyanchor command. A URL must be provided unless you are requesting the help page.

To get the help page:

pyanchor --help

Basic example for a single page:

Note: all provided URLs must include a valid HTTP scheme.

pyanchor https://mysite.com/
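If you build the URL list yourself before calling the CLI, a quick pre-check can catch URLs that are missing a scheme. The following is a minimal sketch using only the standard library; has_http_scheme is a hypothetical helper name, and this is not necessarily the validation PyAnchor performs internally.

```python
from urllib.parse import urlparse


def has_http_scheme(url: str) -> bool:
    """Return True if the URL uses http or https and names a host.

    Hypothetical helper for illustration; not part of PyAnchor.
    """
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(has_http_scheme("https://mysite.com/"))  # scheme and host present
print(has_http_scheme("mysite.com"))           # no scheme, so rejected
```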

If you want to check all links on a website, and not just a single page, a sitemap.xml URL may be provided and flagged with --sitemap.

Example:

pyanchor https://mysite.com/sitemap.xml --sitemap
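To illustrate what a sitemap contributes, the sketch below pulls the page URLs out of a sitemap document with the standard library. It assumes a sitemap in the standard sitemaps.org format; page_urls and SAMPLE_SITEMAP are hypothetical names, and PyAnchor's own sitemap handling may differ.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sample data for illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://mysite.com/</loc></url>
  <url><loc>https://mysite.com/about/</loc></url>
</urlset>"""


def page_urls(sitemap_xml: str) -> List[str]:
    """Return the <loc> value of every <url> entry, in document order."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


for url in page_urls(SAMPLE_SITEMAP):
    print(url)
```

Each URL found this way is a page whose links would then be checked in turn.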

By default, successful requests are not printed to the terminal. To see all URLs with a 200 response, add the --verbose flag.

pyanchor https://mysite.com --verbose

pyanchor https://mysite.com/sitemap.xml --sitemap --verbose

If you wish to output the results to a CSV file instead of to the terminal (the default), use the --output-csv flag:

pyanchor https://mysite.com --output-csv output/path/to/file

But wait, there's more...

To integrate PyAnchor into your application, you can import the LinkResults class. LinkResults requires a URL.

Example:

>>> from pyanchor.link_checker import LinkResults
>>> r = LinkResults("https://mysite.com/")
>>> r.results
{200: ["https://mysite.com/about/", "https://mysite.com/contact/"], 500: ["https://mysite.com/doh!/"]}

As you can see, the results attribute is a dictionary: each key is a response code that was returned, and each value is a list of the URLs that returned that code.
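Because results is a plain dictionary, picking out the failures takes only a few lines. The sketch below works on a sample dictionary with the same shape as the output above, so it runs without network access; broken_links is a hypothetical helper, and it assumes the status codes are integers, as they appear in the example.

```python
def broken_links(results):
    """Flatten {status_code: [urls]} into sorted (status, url) pairs for codes >= 400.

    Hypothetical helper for illustration; assumes integer status codes.
    """
    return sorted(
        (status, url)
        for status, urls in results.items()
        if status >= 400
        for url in urls
    )


# Sample data copied from the example above. In real use you would pass
# LinkResults("https://mysite.com/").results instead.
sample = {
    200: ["https://mysite.com/about/", "https://mysite.com/contact/"],
    500: ["https://mysite.com/doh!/"],
}

for status, url in broken_links(sample):
    print(status, url)
```

A check like this could run in a test suite or CI job and fail the build whenever the returned list is not empty.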

Analyzing Links

PyAnchor gives you the ability to use the LinkAnalysis class to check the links in a given URL for unsafe and obsolete attributes.

To check for obsolete attributes use the obsolete_attrs property:

>>> from pyanchor.link_checker import LinkAnalysis
>>> r = LinkAnalysis("https://mysite.com/")
>>> r.obsolete_attrs
{'/about/link-1': ['charset', 'rev'], '/about/link-2': ['name']}
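On a large page it can help to see which obsolete attributes occur most often. The sketch below tallies them from a dictionary shaped like the output above; obsolete_attr_counts is a hypothetical helper, and the sample data is copied from the example rather than fetched.

```python
from collections import Counter


def obsolete_attr_counts(obsolete):
    """Count how many links use each obsolete attribute.

    Hypothetical helper; expects a {link: [attribute, ...]} mapping.
    """
    return Counter(attr for attrs in obsolete.values() for attr in attrs)


# Sample data copied from the example above.
sample = {"/about/link-1": ["charset", "rev"], "/about/link-2": ["name"]}

counts = obsolete_attr_counts(sample)
for attr, count in sorted(counts.items()):
    print(attr, count)
```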

Likewise, you can check for unsafe links with unsafe_attrs:

>>> from pyanchor.link_checker import LinkAnalysis
>>> r = LinkAnalysis("https://mysite.com/")
>>> r.unsafe_attrs
{<a href="/about/link-4" target="_blank">Link 4</a>: True, <a href="/about/link-5" rel="noreferrer noopener" target="_blank">Link 5</a>: False}

Any link that uses the target attribute without including rel="noopener" will return True; that is, it is True that the link is unsafe. Therefore, links with appropriate attributes will return False.
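The rule itself is small enough to sketch with the standard library. The following is an illustration of the check described here, not PyAnchor's actual implementation; UnsafeLinkFinder is a hypothetical class name.

```python
from html.parser import HTMLParser


class UnsafeLinkFinder(HTMLParser):
    """Collect the href of every <a> that sets target without rel="noopener".

    Hypothetical class for illustration; not part of PyAnchor.
    """

    def __init__(self):
        super().__init__()
        self.unsafe = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, e.g. "noreferrer noopener".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "target" in attr_map and "noopener" not in rel_tokens:
            self.unsafe.append(attr_map.get("href"))


html = (
    '<a href="/about/link-4" target="_blank">Link 4</a>'
    '<a href="/about/link-5" rel="noreferrer noopener" target="_blank">Link 5</a>'
)

finder = UnsafeLinkFinder()
finder.feed(html)
print(finder.unsafe)
```

Only the first link is reported, because the second one carries noopener in its rel attribute.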

Feedback

If you find a bug, please file an issue.

If you have feature requests, please file an issue and use the appropriate label.

Support

If you would like to show your support for the project, you can sponsor me on GitHub. 🤓

How to Contribute

Please raise an issue before making a PR, so that the issue and implementation can be discussed before you write any code. This will save you time, and increase the chances of your PR being merged without significant changes.

Please make PRs on a new branch, and not on main/master.

Please include tests for any PRs that include code (unless current tests cover your code contribution).

Please add documentation for any new features or flags.
