automatically create bookmarks in a PDF file
pdf_scout
This CLI tool automatically generates PDF bookmarks (also known as an 'outline' or a 'table of contents') for computer-generated PDF documents.
You can install it globally via pip, run it against a PDF file, and uninstall it when it is no longer needed:
pip install pdf_scout
pdf_scout ./my_document.pdf
pip uninstall pdf_scout
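If you have a whole folder of PDFs to process, the single-file command above can be wrapped in a small script. The following is only a sketch: `build_commands` and `run_all` are hypothetical helper names, not part of pdf_scout, and the script assumes nothing about the tool beyond the `pdf_scout <file>` invocation shown above.

```python
import subprocess
from pathlib import Path


def build_commands(folder):
    """Return one `pdf_scout <file>` command per PDF in `folder`, sorted by path."""
    pdfs = sorted(p for p in Path(folder).iterdir() if p.suffix.lower() == ".pdf")
    return [["pdf_scout", str(p)] for p in pdfs]


def run_all(folder):
    """Run pdf_scout on each PDF in turn, stopping at the first failure."""
    for cmd in build_commands(folder):
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(cmd, check=True)


# Example (requires pdf_scout to be installed and on PATH):
# run_all("./my_documents")
```

Building the command list separately from running it makes it easy to inspect which files will be processed before anything is executed.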
This project is a work in progress and will likely only generate suitable bookmarks for documents that conform to the following requirements:
- Single column of text (not multiple columns)
- Header text has a larger font size than body text
- Header text is justified or left-aligned
- Headers have larger paragraph spacing than body text
- Consistent left margins on every page
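To illustrate why these requirements matter, here is a minimal sketch of a header-detection heuristic built on the same assumptions. It is not pdf_scout's actual implementation: the `Line` type, the `guess_headers` function, and the `tolerance` parameter are all hypothetical, and the sketch ignores paragraph spacing entirely. It treats the most common font size as the body size and the most common left offset as the page margin, then keeps lines that are larger than body text and start at that margin.

```python
from collections import Counter
from typing import List, NamedTuple


class Line(NamedTuple):
    text: str
    font_size: float  # in points
    left_x: float     # distance from the left edge of the page, in points


def guess_headers(lines: List[Line], tolerance: float = 0.5) -> List[Line]:
    """Return lines that look like headers under the assumptions above."""
    if not lines:
        return []
    # Assumption: body text is the most frequent font size in the document.
    body_size = Counter(line.font_size for line in lines).most_common(1)[0][0]
    # Assumption: the left margin is the most frequent left offset.
    margin = Counter(line.left_x for line in lines).most_common(1)[0][0]
    return [
        line
        for line in lines
        if line.font_size > body_size and abs(line.left_x - margin) <= tolerance
    ]
```

A multi-column layout or inconsistent margins would break the "most common left offset" assumption, and a centered heading would be filtered out by the margin check, which is consistent with the requirements listed above.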
Supported document types
pdf_scout has been tested on and expressly supports the following classes of documents:
- Singapore State Court and Supreme Court Judgments (unreported)
- Singapore Law Reports
It may support other types of documents as well. If a particular class of document isn't supported or does not work well, please open an issue and I will consider adding support for it.
Development
This project manages its dependencies using poetry and supports only Python ^3.9 (that is, 3.9 or later, below 4.0). After installing poetry and entering the project folder, run the following to install the dependencies:
poetry install
To open a virtualenv in the project folder with the dependencies, run:
poetry shell
To run a script directly, run:
poetry run python ./src/app.py
Tests
There are snapshot tests. Input PDFs are not provided at the moment, so you will have to populate the /pdf folder manually before running them:
poetry run pytest
To update the stored snapshots, run:
poetry run pytest --snapshot-update