PDF Link Checker

This is a fork of the pdf-link-checker.

Situation: You need to upload a PDF somewhere.

Now you want to check that all the links are still active, so that reviewers, readers, or students do not end up with 404 error codes. Let this script check that for you!

Setup

  1. Install Python

  2. Install python-pdf-link-checker from the Python Package Index (PyPI).

    pip install python-pdf-link-checker
    

    Attention: On macOS, pip may belong to a Python 2 installation. In that case, use pip3 or pip3.x instead.

  3. Now you should be able to call pdf-link-checker within your shell.

    $ pdf-link-checker --version
    pdf-link-checker 1.1.5
    

Usage

Check Links

$ pdf-link-checker check-links --help
Usage: pdf-link-checker check-links [OPTIONS] [PDF_FILE]

  - Get input PDF and output CSV location. - execute
  check_pdf_links(infilepath, infilepath) - Save the report to output CSV
  location.

Arguments:
  [PDF_FILE]  The PDF file to check.

Options:
  -r, --report FILE          The CSV file with all the checked links.
                             [default: report.csv]

  -I, --ignore-url TEXT      URL that should not be checked, e.g., because we
                             now that they are not activated yet.  [default: ]

  -C, --ci                   If set, the command will exit with an error code
                             if there are broken URLs.  [default: False]

  -c, --csv-delimiter TEXT   The CSV delimiter, e.g., `;`  [default: ;]
  -A, --ignore-unauthorized  If this flag is set, we will ignore 403 status
                             codes. Some websites block scripts, and thus
                             existing links will result in 403 codes.
                             [default: False]

  --help                     Show this message and exit.
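
The options above can be combined in a CI job. The following is a minimal sketch, not part of the package: the helper names `build_check_links_command` and `run_in_ci` are hypothetical, and passing `--ignore-url` once per URL is an assumption that the help text does not confirm.

```python
import subprocess
from typing import List, Sequence


def build_check_links_command(
    pdf_file: str,
    report: str = "report.csv",
    ignore_urls: Sequence[str] = (),
    ci: bool = False,
    ignore_unauthorized: bool = False,
) -> List[str]:
    """Assemble a `pdf-link-checker check-links` call from the documented options."""
    command = ["pdf-link-checker", "check-links", "--report", report]
    for url in ignore_urls:
        # Assumption: the option is repeated once per URL to skip.
        command += ["--ignore-url", url]
    if ci:
        # Documented: exit with an error code if there are broken URLs.
        command.append("--ci")
    if ignore_unauthorized:
        # Documented: ignore 403 status codes.
        command.append("--ignore-unauthorized")
    command.append(pdf_file)
    return command


def run_in_ci(pdf_file: str) -> int:
    """Run the check and return the exit code so the CI job can fail on it."""
    command = build_check_links_command(pdf_file, ci=True, ignore_unauthorized=True)
    return subprocess.run(command).returncode
```

A CI script could then call `sys.exit(run_in_ci("main.pdf"))` so that broken links fail the build.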

Check Page Limit

$ pdf-link-checker check-page-limit --help
Usage: pdf-link-checker check-page-limit [OPTIONS] [PDF_FILE]

  Check the page limit.

Arguments:
  [PDF_FILE]  The PDF file to check.

Options:
  -l, --page-limit INTEGER  The maximal number of pages
  --help                    Show this message and exit.

Example

$ pdf-link-checker check-links main.pdf
Starting
100%|█████████| 5/5 [00:30<00:00,  6.18s/it]
Done: .../report.csv
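
The generated report can be post-processed with a few lines of Python. This is a minimal sketch that only relies on the documented default delimiter `;`. The column layout of the report is not described here, so the hypothetical helper `load_report` returns raw rows without interpreting them.

```python
import csv
from pathlib import Path
from typing import List


def load_report(path: str, delimiter: str = ";") -> List[List[str]]:
    """Read the CSV report into a list of rows, skipping empty lines."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        return [row for row in reader if row]
```

For example, `for row in load_report("report.csv"): print(row)` lists every checked link together with whatever fields the tool wrote for it.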

Run pytest to validate the code

From the script directory, run pytest to validate the code. The tests use the PDFs in the data folder.

Contact

If you have any questions, please contact Patrick Stöckle.
