automatically create bookmarks in a PDF file
pdf_scout
This CLI tool automatically generates PDF bookmarks (also known as an 'outline' or a 'table of contents') for computer-generated PDF documents.
You can install it globally via pip, run it on a PDF, and uninstall it when you are done:
pip install pdf_scout
pdf_scout ./my_document.pdf
pip uninstall pdf_scout
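To process several documents at once, the single-file invocation above can be wrapped in a small script. The following is a minimal sketch, assuming only that the CLI accepts one PDF path as its argument, as shown above. The helper names `build_commands` and `run_all` are hypothetical and not part of pdf_scout.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(folder: Path) -> list[list[str]]:
    """Return one `pdf_scout <file>` command per PDF found directly in `folder`."""
    pdfs = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    return [["pdf_scout", str(p)] for p in pdfs]


def run_all(folder: Path) -> None:
    """Run each command in turn; a non-zero exit status raises CalledProcessError."""
    if shutil.which("pdf_scout") is None:
        raise RuntimeError("pdf_scout is not on PATH; install it with pip first")
    for cmd in build_commands(folder):
        subprocess.run(cmd, check=True)
```

Calling `run_all(Path("."))` would then run the tool once for every PDF in the current folder.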
This project is a work in progress and will likely only generate accurate bookmarks for documents that conform to the following requirements:
- Single column of text (not multiple columns)
- Font size of header text >= font size of body text
- Header text is justified or left-aligned
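To see why these requirements matter, consider how a font-size heuristic might pick out headers. The sketch below is purely illustrative and is not pdf_scout's actual algorithm. The `Line` type, the `find_header_candidates` function, and the `margin_tolerance` parameter are all assumptions made for this example, and the input is taken to be text lines already extracted from a single-column PDF.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Line:
    text: str
    font_size: float  # in points
    x0: float         # left edge of the line, in points


def find_header_candidates(
    lines: list[Line], margin_tolerance: float = 2.0
) -> list[tuple[int, str]]:
    """Return (level, text) pairs for lines that look like headers."""
    if not lines:
        return []

    # Treat the font size that covers the most characters as the body size.
    chars_per_size: Counter = Counter()
    for line in lines:
        chars_per_size[line.font_size] += len(line.text)
    body_size = chars_per_size.most_common(1)[0][0]

    # In a single-column layout, the smallest x0 approximates the left margin.
    left_margin = min(line.x0 for line in lines)

    # Larger fonts map to shallower levels: the biggest size is level 1.
    header_sizes = sorted(
        {l.font_size for l in lines if l.font_size > body_size}, reverse=True
    )
    level_of = {size: i + 1 for i, size in enumerate(header_sizes)}

    return [
        (level_of[l.font_size], l.text)
        for l in lines
        if l.font_size in level_of and abs(l.x0 - left_margin) <= margin_tolerance
    ]
```

This sketch only catches headers set in a strictly larger font than the body. Headers at the same size as the body would need another signal, such as font weight. A multi-column layout would break the left-margin check, and centered headers would be skipped entirely.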
Development
This project manages its dependencies using poetry and supports only Python ^3.9 (version 3.9 or later, below 4.0). After installing poetry and entering the project folder, run the following to install the dependencies:
poetry install
To open a virtualenv in the project folder with the dependencies, run:
poetry shell
To run a script directly, run:
poetry run python ./src/app.py
Tests
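The project has snapshot tests. Input PDFs are not provided at the moment, so you will have to populate the /pdf folder manually before running them. The first command below runs the tests; the second updates the stored snapshots: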
poetry run pytest
poetry run pytest --snapshot-update
Hashes for pdf_scout-0.0.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 298e4671873633b49f665009fb2afac8f664afc3a86ba5bf8374d89ce04ba6e3
MD5 | 281956e04077a1c60dcfbdcd44400e22
BLAKE2b-256 | cb9f8cdd7a0d7da022eb9bf4ab692b3ce5c6216eb8ad8e1bb02e43aee38808f2