Data Object Layer for PDF data

Project description

pdfdol

To install: pip install pdfdol

Examples

Pdf "Stores"

Get a dict-like object to list and read the pdfs of a folder, as text:

>>> from pdfdol import PdfFilesReader
>>> from pdfdol.tests import get_test_pdf_folder
>>> folder_path = get_test_pdf_folder()
>>> pdfs = PdfFilesReader(folder_path)
>>> sorted(pdfs)
['sample_pdf_1', 'sample_pdf_2']
>>> assert pdfs['sample_pdf_2'] == [
...     'Page 1\nThis is a sample text for testing Python PDF tools.'
... ]
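
To make the "dict-like" idea concrete, here is a minimal stdlib-only sketch of a read-only store over a folder. It is not how pdfdol is implemented: `TextFolderReader` is a hypothetical name, and it reads plain `.txt` files rather than extracting text from PDFs.

```python
import tempfile
from collections.abc import Mapping
from pathlib import Path


class TextFolderReader(Mapping):
    """Hypothetical read-only store: keys are file stems, values are file contents."""

    def __init__(self, folder, suffix=".txt"):
        self.folder = Path(folder)
        self.suffix = suffix

    def __iter__(self):
        # Keys are the file names without their extension
        for path in self.folder.glob(f"*{self.suffix}"):
            yield path.stem

    def __len__(self):
        return sum(1 for _ in self)

    def __getitem__(self, key):
        path = self.folder / f"{key}{self.suffix}"
        if not path.is_file():
            raise KeyError(key)
        return path.read_text()


# Build a throwaway folder to read from
demo_folder = tempfile.mkdtemp()
Path(demo_folder, "b.txt").write_text("second")
Path(demo_folder, "a.txt").write_text("first")

store = TextFolderReader(demo_folder)
```

Because the class derives from `Mapping`, listing (`sorted(store)`), lookup (`store['a']`), membership tests and `len` all work as they would on a dict, with the folder contents read only on access.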

Note that the values of a PdfFilesReader are lists of pages. If you need a single string per PDF (i.e. all the pages joined together), you can add a decoder like so:

from dol import add_decoder
page_separator = '---------------------'
pdfs = add_decoder(pdfs, decoder=page_separator.join)

If you need this at the class level, wrap the class instead of the instance:

from dol import add_decoder
page_separator = '---------------------'
FilesReader = add_decoder(PdfFilesReader, decoder=page_separator.join)
# and then
pdfs = FilesReader(folder_path)
# ...
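
What the decoder does to each value can be checked without any PDF at hand. The snippet below applies the same `page_separator.join` to a hand-written list of pages; the page texts are made up for illustration.

```python
page_separator = '---------------------'

# A value shaped like those of a PdfFilesReader: one string per page (made-up texts)
pages = ['Page 1\nFirst page text.', 'Page 2\nSecond page text.']

# The decoder turns the list of pages into a single string
document_text = page_separator.join(pages)
```

Note that `str.join` inserts the separator as-is, with no newline on either side; if you want it on its own line, include the newlines in `page_separator` itself.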

If you need to concatenate a bunch of PDFs together, you can do so in many ways. Here's one:

from dol import Files
from pdfdol import concat_pdfs

s = Files('~/Downloads/cosmograph_documentation_pdfs/')
concat_pdfs(s, key_order=sorted)
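
The `key_order=sorted` argument suggests that a callable is applied to the store's keys to decide the concatenation order. The sketch below illustrates that idea only, under that assumption; `values_in_order` is a hypothetical helper, not part of pdfdol, and the byte strings are placeholders rather than real PDF data.

```python
def values_in_order(store, key_order=sorted):
    """Hypothetical helper: return the store's values in the order given by key_order(keys)."""
    ordered_keys = key_order(store)  # e.g. sorted(store) sorts the keys
    return [store[key] for key in ordered_keys]


# Placeholder store whose insertion order differs from its sorted order
store = {'part_2.pdf': b'second', 'part_1.pdf': b'first'}

in_sorted_order = values_in_order(store)
in_reverse_order = values_in_order(
    store, key_order=lambda keys: sorted(keys, reverse=True)
)
```

Passing the ordering as a callable keeps the choice with the caller: any function that takes the keys and returns them in the desired order would fit this sketch.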

Get pdf from various sources

Example with a URL

from pdfdol import get_pdf  # assumption: get_pdf is importable from the package top level

pdf_data = get_pdf("https://pypi.org", src_kind="url")
print("Got PDF data of length:", len(pdf_data))

Example with HTML content

html_content = "<html><body><h1>Hello, PDF!</h1></body></html>"
pdf_data = get_pdf(html_content, src_kind="html")
print("Got PDF data of length:", len(pdf_data))

Example saving to file

filepath = get_pdf("https://pypi.org", egress="output.pdf", src_kind="url")
print("PDF saved to:", filepath)
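
In the examples above, the call returns the PDF data when no `egress` is given and a file path when `egress` is a file name. That pattern can be sketched as follows. `deliver` is a hypothetical helper written for illustration, not pdfdol's code, and the bytes are a placeholder rather than a valid PDF.

```python
import os
import tempfile


def deliver(data, egress=None):
    """Hypothetical helper: return data as-is, or save it and return the path."""
    if egress is None:
        return data  # caller gets the raw bytes
    with open(egress, 'wb') as f:
        f.write(data)
    return egress  # caller gets the path that was written


placeholder = b'%PDF-1.4 placeholder'

returned_bytes = deliver(placeholder)
target = os.path.join(tempfile.mkdtemp(), 'output.pdf')
returned_path = deliver(placeholder, egress=target)
```

With a single optional argument deciding where the result goes, the same call can serve both in-memory use and saving to disk.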

Project details

Release history

Version	Release date
0.1.23 (this version)	Jan 27, 2026
0.1.22	Oct 2, 2025
0.1.21	Sep 8, 2025
0.1.20	Aug 24, 2025
0.1.19	Aug 23, 2025
0.1.18	Aug 22, 2025
0.1.17	Aug 22, 2025
0.1.16	Mar 3, 2025
0.1.15	Mar 2, 2025
0.1.14	Feb 28, 2025
0.1.13	Feb 11, 2025
0.1.12	Jan 9, 2025
0.1.11	Nov 26, 2024
0.1.10	Nov 12, 2024
0.1.9	Jun 11, 2024
0.1.8	Jun 7, 2024
0.1.7	Jun 7, 2024
0.1.6	Jun 3, 2024
0.1.5	May 23, 2024
0.1.4	May 23, 2024
0.1.3	Apr 3, 2024
0.1.2	Jan 2, 2024
0.1.1	Jan 2, 2024
0.1.0	Jan 2, 2024

Download files

Source Distribution

pdfdol-0.1.23.tar.gz (18.1 kB view details)

Uploaded Jan 27, 2026 Source

Built Distribution

pdfdol-0.1.23-py3-none-any.whl (20.4 kB view details)

Uploaded Jan 27, 2026 Python 3

File details

Details for the file pdfdol-0.1.23.tar.gz.

File metadata

Download URL: pdfdol-0.1.23.tar.gz
Upload date: Jan 27, 2026
Size: 18.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for pdfdol-0.1.23.tar.gz
Algorithm	Hash digest
SHA256	`3ff15b4b27e0c52b627f05cca7464dda0de6b0ea140431c4c3c4aeab6c53ab42`
MD5	`3df574b1230c4db77e67620181526724`
BLAKE2b-256	`a789da229b38f4ef38bc0fdc3d687d4a59da7adbba8668399aa8a4c1442dbcdf`

File details

Details for the file pdfdol-0.1.23-py3-none-any.whl.

File metadata

Download URL: pdfdol-0.1.23-py3-none-any.whl
Upload date: Jan 27, 2026
Size: 20.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for pdfdol-0.1.23-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ca86390909ff4223e3a8de131dc323beb6eea6bd24c9264f49a287200a27cb6d`
MD5	`e4eb64b99d49d4edf5f724ea8b4999d9`
BLAKE2b-256	`d35f97098a25e77eecaa64534d344673b6a0a6ad0f96d86a51f1f51bc72b919b`
