Automatically create bookmarks in a PDF file

pdf_scout

This CLI tool automatically generates PDF bookmarks (also known as an 'outline' or a 'table of contents') for computer-generated PDF documents.

You can install it globally via pip, run it on a PDF file, and uninstall it once you no longer need it:

pip install pdf_scout
pdf_scout ./my_document.pdf
pip uninstall pdf_scout

This project is a work in progress and will likely only generate accurate bookmarks for documents that conform to the following requirements:

  • Single column of text (not multiple columns)
  • Font size of header text >= font size of body text
  • Header text is justified or left-aligned
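These requirements make sense if headers are found with a font-size heuristic. The following is a rough sketch of such a heuristic, not pdf_scout's actual implementation: the `Line` type, the `find_headings` function, and the `left_margin_tolerance` parameter are hypothetical names chosen for illustration. It simplifies by only flagging lines whose font is strictly larger than the body text.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Line:
    text: str
    font_size: float  # size in points
    x0: float         # left edge of the line on the page


def find_headings(lines, left_margin_tolerance=2.0):
    """Return lines that look like headings under a simple font-size heuristic."""
    if not lines:
        return []
    # Assumption: the most frequent font size belongs to the body text.
    body_size = Counter(line.font_size for line in lines).most_common(1)[0][0]
    # Assumption: a single column, so all lines share one left margin.
    left_margin = min(line.x0 for line in lines)
    return [
        line
        for line in lines
        if line.font_size > body_size
        and line.x0 - left_margin <= left_margin_tolerance
    ]


sample = [
    Line("1 Introduction", 16.0, 72.0),
    Line("Body text of the first paragraph.", 10.0, 72.0),
    Line("More body text.", 10.0, 72.0),
    Line("A Centered Title", 16.0, 200.0),
    Line("Even more body text.", 10.0, 72.0),
]
headings = find_headings(sample)
print([line.text for line in headings])  # ['1 Introduction']
```

The centered line is skipped because it does not start at the left margin, which hints at why centered headers or multi-column layouts could confuse a heuristic of this kind.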

Development

This project manages its dependencies using poetry and supports Python ^3.9 only (in poetry's caret notation, >=3.9 and <4.0). After installing poetry and entering the project folder, run the following to install the dependencies:

poetry install

To open a shell inside the project's virtualenv, with the dependencies available, run:

poetry shell

To run a script directly, without activating the virtualenv first, run:

poetry run python ./src/app.py

Tests

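The project uses snapshot tests. Input PDFs are not provided at the moment, so you will have to populate the /pdf folder manually before running them. The first command below runs the tests; the second updates the stored snapshots: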

poetry run pytest
poetry run pytest --snapshot-update
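For readers unfamiliar with snapshot testing, the idea can be sketched with the standard library alone. This illustrates the concept and is not the project's test code: `check_snapshot`, the JSON storage format, and the bookmark dictionaries are hypothetical. As a simplification, this sketch also records a snapshot automatically when none exists yet.

```python
import json
import tempfile
from pathlib import Path


def check_snapshot(name, actual, snapshot_dir, update=False):
    """Compare `actual` with the stored snapshot; write it when updating or missing."""
    path = Path(snapshot_dir) / f"{name}.json"
    if update or not path.exists():
        path.write_text(json.dumps(actual, indent=2))
        return True
    return json.loads(path.read_text()) == actual


with tempfile.TemporaryDirectory() as tmp:
    bookmarks = [{"title": "1 Introduction", "page": 1}]
    # No snapshot exists yet, so the result is recorded.
    first_run = check_snapshot("my_document", bookmarks, tmp)
    # The same output matches the recorded snapshot.
    second_run = check_snapshot("my_document", bookmarks, tmp)
    # Different output no longer matches and is reported as a failure.
    changed = check_snapshot(
        "my_document", bookmarks + [{"title": "2 Methods", "page": 3}], tmp
    )
    print(first_run, second_run, changed)  # True True False
```

Passing `update=True` plays the role of `--snapshot-update`: it overwrites the stored result instead of comparing against it.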
