PDF AcroForm field parser for Swarmauri using PyPDFTK and the native pdftk toolchain.

Swarmauri Parser PyPDFTK

swarmauri_parser_pypdftk is the Swarmauri PDF form-field parser for extracting AcroForm data through PyPDFTK and the native pdftk toolchain. It converts structured PDF form fields into a single Swarmauri Document so downstream workflows can index, validate, or route filled forms.

Why Use Swarmauri Parser PyPDFTK

Extract structured PDF field data instead of only free-form text.
Normalize AcroForm output into a Swarmauri Document for ingestion, automation, and analysis pipelines.
Keep form parsing aligned with the same Swarmauri parser interface used by other document components.
Pair form-field extraction with other PDF parsers when both structured fields and page text matter.

FAQ

What does this parser extract?
PDF form fields returned by pypdftk.dump_data_fields, such as AcroForm names and values.

Does it parse ordinary PDF text?
No. This package is for structured PDF form fields. Use another parser for general page text.

Does it need a system binary?
Yes. It depends on the pdftk or pdftk-java executable being installed and available on PATH.

What happens when the PDF has no form fields?
The parser returns an empty list.

Features

Extracts PDF AcroForm fields through PyPDFTK.
Returns one Swarmauri Document with newline-delimited key: value content.
Preserves the input source path in metadata.
Useful for form ingestion, validation, compliance workflows, and automation.
Supports Python 3.10, 3.11, 3.12, 3.13, and 3.14.

Installation

uv add swarmauri_parser_pypdftk

pip install swarmauri_parser_pypdftk

System requirement:

Install pdftk or pdftk-java and make sure the executable is available on PATH.
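
A pre-flight check along these lines can catch a missing binary before a pipeline starts. This is a minimal sketch using only the Python standard library. The helper name find_pdftk is hypothetical and not part of this package, and it assumes the executable is named pdftk; pass other names through candidates if your installation differs.

```python
import shutil
from typing import Optional, Sequence


def find_pdftk(candidates: Sequence[str] = ("pdftk",)) -> Optional[str]:
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    location = find_pdftk()
    if location is None:
        print("pdftk was not found on PATH; install pdftk or pdftk-java first.")
    else:
        print(f"Using pdftk at {location}")
```

Running a check like this at service start-up, rather than on the first parse call, surfaces a missing dependency early.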

Usage

from swarmauri_parser_pypdftk import PyPDFTKParser

parser = PyPDFTKParser()
documents = parser.parse("forms/enrollment.pdf")

for document in documents:
    print(document.metadata["source"])
    print(document.content)

Examples

Extract form fields from a filled PDF

from swarmauri_parser_pypdftk import PyPDFTKParser

parser = PyPDFTKParser()
docs = parser.parse("forms/application.pdf")

if docs:
    print(docs[0].content)

Example output:

GivenName: John
FamilyName: Doe
BirthDate: 1990-01-01
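
For validation workflows, the key: value content shown above can be turned back into a dictionary. This is a minimal sketch, assuming every non-empty line holds one field, that the first colon separates the name from the value, and that values do not span multiple lines. parse_field_content and missing_fields are hypothetical helpers, not part of the package.

```python
from typing import Dict, Iterable, List


def parse_field_content(content: str) -> Dict[str, str]:
    """Split newline-delimited 'key: value' text into a dictionary.

    Assumption: the first ':' on each line separates name from value.
    """
    fields: Dict[str, str] = {}
    for line in content.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def missing_fields(fields: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the required field names that are absent or empty."""
    return [name for name in required if not fields.get(name)]


sample = "GivenName: John\nFamilyName: Doe\nBirthDate: 1990-01-01"
fields = parse_field_content(sample)
print(fields["BirthDate"])                             # 1990-01-01
print(missing_fields(fields, ["GivenName", "Email"]))  # ['Email']
```

In practice the sample string would be replaced by docs[0].content from the parser.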

Detect forms without field data

from swarmauri_parser_pypdftk import PyPDFTKParser

parser = PyPDFTKParser()
docs = parser.parse("forms/plain.pdf")

if not docs:
    print("No PDF form fields were detected.")

Related Packages

Swarmauri Foundations

Best Practices

Use this parser for PDFs with real AcroForm fields, not for generic PDF page text.
Validate that the pdftk binary is installed in deployment targets before running pipelines that depend on this package.
Pair this package with swarmauri_parser_pypdf2 or swarmauri_parser_fitzpdf if you also need free-form page text.
Route scanned documents through OCR when they are image-based and contain no usable form structure.
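
The routing advice above can be sketched as a small dispatcher. It is an illustration under assumptions: form_parse and text_parse stand in for the parse methods of this parser and a page-text parser, run_ocr is a hypothetical OCR step, and each is expected to return a list that is empty when nothing was extracted.

```python
from typing import Callable, List, Tuple

Parse = Callable[[str], List[object]]


def route_pdf(
    path: str,
    form_parse: Parse,
    text_parse: Parse,
    run_ocr: Parse,
) -> Tuple[str, List[object]]:
    """Try form fields first, then page text, then OCR.

    Returns a (route, documents) pair where route is "form", "text", or "ocr".
    """
    docs = form_parse(path)
    if docs:
        return "form", docs
    docs = text_parse(path)
    if docs:
        return "text", docs
    # Nothing extractable so far: treat the file as image-based.
    return "ocr", run_ocr(path)


if __name__ == "__main__":
    # Stub callables stand in for real parsers so the sketch runs anywhere.
    route, docs = route_pdf(
        "forms/plain.pdf",
        form_parse=lambda p: [],
        text_parse=lambda p: ["page text"],
        run_ocr=lambda p: ["ocr text"],
    )
    print(route, docs)
```

This version stops at the first extractor that yields documents. When both structured fields and page text are needed, call both parsers and keep both results instead.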

License

This project is licensed under the Apache-2.0 License.

Release history

This version

0.11.0.dev1 pre-release

Jun 30, 2026

0.9.1.dev3 pre-release

May 20, 2026

0.9.1.dev2 pre-release

May 20, 2026

0.9.0

Mar 24, 2026

0.9.0.dev7 pre-release

Mar 23, 2026

0.9.0.dev5 pre-release

Mar 20, 2026

0.9.0.dev4 pre-release

Mar 20, 2026

0.9.0.dev3 pre-release

Mar 20, 2026

0.9.0.dev2 pre-release

Mar 20, 2026

0.9.0.dev1 pre-release

Mar 20, 2026

0.8.3.dev18 pre-release

Mar 20, 2026

0.8.3.dev17 pre-release

Mar 20, 2026

0.8.3.dev10 pre-release

Feb 23, 2026

0.8.3.dev5 pre-release

Feb 18, 2026

0.8.3.dev4 pre-release

Feb 17, 2026

0.8.3.dev3 pre-release

Feb 17, 2026

0.8.2

Feb 17, 2026

0.8.2.dev7 pre-release

Feb 17, 2026

0.8.2.dev6 pre-release

Feb 12, 2026

0.8.0

Jan 28, 2026

0.8.0.dev21 pre-release

Jan 27, 2026

0.8.0.dev4 pre-release

Sep 11, 2025

0.8.0.dev3 pre-release

Sep 10, 2025

0.8.0.dev2 pre-release

Sep 10, 2025

0.7.5

May 23, 2025

0.7.5.dev1 pre-release

May 23, 2025

0.7.4

May 23, 2025

0.7.4.dev20 pre-release

May 23, 2025

0.7.3

Mar 31, 2025

0.7.3.dev2 pre-release

Mar 31, 2025

0.7.2

Mar 6, 2025

0.7.2.dev3 pre-release

Mar 6, 2025

0.7.2.dev2 pre-release

Mar 6, 2025

0.7.2.dev1 pre-release

Mar 6, 2025

0.7.1

Mar 6, 2025

0.7.1.dev1 pre-release

Mar 5, 2025

0.7.0

Mar 4, 2025

0.7.0.dev12 pre-release

Mar 4, 2025

0.7.0.dev11 pre-release

Mar 4, 2025

0.7.0.dev10 pre-release

Mar 4, 2025

0.7.0.dev9 pre-release

Mar 4, 2025

0.7.0.dev8 pre-release

Mar 4, 2025

0.7.0.dev7 pre-release

Mar 4, 2025

0.7.0.dev6 pre-release

Mar 4, 2025

0.7.0.dev5 pre-release

Mar 4, 2025

0.7.0.dev4 pre-release

Mar 4, 2025

0.7.0.dev3 pre-release

Mar 4, 2025

0.7.0.dev2 pre-release

Mar 3, 2025

Download files

Source Distribution

swarmauri_parser_pypdftk-0.11.0.dev1.tar.gz (8.1 kB)

Uploaded Jun 30, 2026 Source

Built Distribution

swarmauri_parser_pypdftk-0.11.0.dev1-py3-none-any.whl (9.0 kB)

Uploaded Jun 30, 2026 Python 3

File details

Details for the file swarmauri_parser_pypdftk-0.11.0.dev1.tar.gz.

File metadata

Download URL: swarmauri_parser_pypdftk-0.11.0.dev1.tar.gz
Upload date: Jun 30, 2026
Size: 8.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.26 {"installer":{"name":"uv","version":"0.11.26","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for swarmauri_parser_pypdftk-0.11.0.dev1.tar.gz
Algorithm	Hash digest
SHA256	`4b66699514e398ba6613868d77125f20ea618e918a6103d6a98562f9d3dbe6a4`
MD5	`4b5bab06364b18d88c78aaae2ee15c7b`
BLAKE2b-256	`62c9afce8099c2dc2bdf84b696d3acdddb3e86442bcea3621606a9cb51851f81`

File details

Details for the file swarmauri_parser_pypdftk-0.11.0.dev1-py3-none-any.whl.

File metadata

Download URL: swarmauri_parser_pypdftk-0.11.0.dev1-py3-none-any.whl
Upload date: Jun 30, 2026
Size: 9.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.26 {"installer":{"name":"uv","version":"0.11.26","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for swarmauri_parser_pypdftk-0.11.0.dev1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`218905d25b8ffc368d190c0bc5d330e2b5c35d74b3e96c7102f9121c5c377902`
MD5	`f27ee33ad45614e525fb73684f194786`
BLAKE2b-256	`0a9cb091f3714b1649793695e9826c89ac6d7138ff7eabf1cc8bdce52bfb83de`
