Page-by-page PDF text parser for Swarmauri using slate3k over local file-path inputs.

Swarmauri Parser Slate

swarmauri_parser_slate is the Swarmauri PDF parser for page-by-page text extraction using slate3k, a lightweight wrapper around PDFMiner. It reads a local PDF path, extracts text for each page, and returns Swarmauri Document objects with source and page metadata.

Why Use Swarmauri Parser Slate

Parse text-based PDFs into page-scoped Document objects for chunking, retrieval, and downstream agent workflows.
Keep document ingestion aligned with the Swarmauri parser interface.
Use a small PDF extraction dependency when slate3k is sufficient for the target document set.
Preserve page numbers so later indexing, annotation, or citation workflows can map text back to the source file.

FAQ

What input does this parser accept?
A local PDF file path as a string.

Does it support raw PDF bytes?
No. The current implementation is path-only and raises TypeError for other input types.
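
When a PDF arrives as bytes (for example from an upload or an HTTP response), one possible workaround is to write the bytes to a temporary file and hand that path to the parser. The sketch below is not part of the package: `parse_pdf_bytes` is a hypothetical helper, and it accepts any callable that takes a path string, such as `SlateParser().parse`.

```python
import os
import tempfile
from typing import Any, Callable, List


def parse_pdf_bytes(parse: Callable[[str], List[Any]], data: bytes) -> List[Any]:
    """Write raw PDF bytes to a temporary file and parse that path.

    `parse` is any callable that takes a file path string, for example
    SlateParser().parse. This helper is hypothetical, not a package API.
    """
    if not isinstance(data, (bytes, bytearray)):
        raise TypeError("data must be bytes")

    # delete=False so the file can be reopened by path on every platform.
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as handle:
        handle.write(data)
        temp_path = handle.name

    try:
        return parse(temp_path)
    finally:
        # Remove the temporary file whether or not parsing succeeded.
        os.remove(temp_path)
```

With this approach the path seen by the parser is the temporary one, so any source metadata on the returned documents would presumably refer to that path; callers may want to replace it with a more meaningful identifier.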

What does it return?
A list of Swarmauri Document objects, usually one per extracted page.

Does it perform OCR on scanned PDFs?
No. It is intended for PDFs that already contain extractable text.

Features

Page-by-page PDF text extraction through slate3k.
Returns Document objects with page_number and source metadata.
Raises TypeError with a clear message for unsupported input types.
Fits Swarmauri ingestion, parsing, and retrieval pipelines.
Supports Python 3.10, 3.11, 3.12, 3.13, and 3.14.
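
The page and source metadata can be turned into a short reference string for citation or annotation workflows. This is a minimal sketch, assuming the metadata keys are named `source` and `page_number`; `cite` is a hypothetical helper, and the `SimpleNamespace` object is only a stand-in for a parsed page.

```python
from types import SimpleNamespace
from typing import Any


def cite(document: Any) -> str:
    """Build a short 'source, p. N' reference from a parsed page."""
    metadata = document.metadata
    return f'{metadata["source"]}, p. {metadata["page_number"]}'


# Stand-in with the attributes used above (content plus source and
# page_number metadata); real parser output will carry real values.
page = SimpleNamespace(
    content="Vacation policy ...",
    metadata={"source": "pdfs/handbook.pdf", "page_number": 3},
)
print(cite(page))  # prints: pdfs/handbook.pdf, p. 3
```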

Installation

uv add swarmauri_parser_slate

pip install swarmauri_parser_slate

Usage

from swarmauri_parser_slate import SlateParser

parser = SlateParser()
documents = parser.parse("pdfs/handbook.pdf")

# Print each page number with a 120-character preview of its text.
for document in documents:
    print(document.metadata["page_number"], document.content[:120])

Examples

Parse a handbook PDF

from swarmauri_parser_slate import SlateParser

parser = SlateParser()
pages = parser.parse("manuals/employee-handbook.pdf")

for page in pages:
    print(page.metadata["page_number"], len(page.content))
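
Page-scoped output can feed a chunking step that keeps the page number on every chunk. The following is a sketch under stated assumptions: `chunk_pages` is a hypothetical helper, fixed-size character windows are just one simple chunking strategy, and the `SimpleNamespace` object stands in for a parsed page.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List


def chunk_pages(pages: Iterable[Any], max_chars: int = 500) -> List[Dict[str, Any]]:
    """Split each page's text into fixed-size chunks tagged with its page number."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    chunks: List[Dict[str, Any]] = []
    for page in pages:
        text = page.content or ""
        # Step through the page text in windows of max_chars characters.
        for start in range(0, len(text), max_chars):
            chunks.append(
                {
                    "page_number": page.metadata["page_number"],
                    "offset": start,
                    "text": text[start : start + max_chars],
                }
            )
    return chunks


# Stand-in page; in practice this would be the list returned by parser.parse().
pages = [SimpleNamespace(content="x" * 1200, metadata={"page_number": 1})]
for chunk in chunk_pages(pages, max_chars=500):
    print(chunk["page_number"], chunk["offset"], len(chunk["text"]))
```

Because chunks never span two pages here, each one maps back to exactly one page of the source file.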

Handle missing files and invalid inputs

from swarmauri_parser_slate import SlateParser

parser = SlateParser()

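# A path that does not exist: print whatever parse() returns for it.
print(parser.parse("missing.pdf"))

# Raw bytes are not a supported input type, so parse() raises TypeError.
try: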
    parser.parse(b"%PDF-1.7 ...")
except TypeError as exc:
    print(exc)
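
Callers who prefer an explicit failure for a missing file can check the path before invoking the parser. This is a minimal sketch, not package behavior: `parse_existing_pdf` is a hypothetical wrapper that takes any callable accepting a path string, such as `SlateParser().parse`.

```python
from pathlib import Path
from typing import Any, Callable, List


def parse_existing_pdf(parse: Callable[[str], List[Any]], path: str) -> List[Any]:
    """Validate the input up front, then delegate to the supplied parse callable."""
    if not isinstance(path, str):
        raise TypeError(f"expected a file path string, got {type(path).__name__}")
    if not Path(path).is_file():
        # Fail loudly here instead of relying on the parser's own handling.
        raise FileNotFoundError(path)
    return parse(path)
```

A call would then look like `parse_existing_pdf(SlateParser().parse, "manuals/employee-handbook.pdf")`.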

Related Packages

Swarmauri Foundations

Best Practices

Use this parser for PDFs that already contain selectable text.
Route scan-only or image-based PDFs through OCR before parsing.
Keep page-granular output when later stages need per-page provenance.
Validate representative PDFs first because extraction quality depends on the original PDF structure.
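
One way to validate a representative PDF is to flag pages whose extracted text is nearly empty, which can hint that the page is scan-only and needs OCR first. This is a heuristic sketch: `pages_needing_ocr` is a hypothetical helper, the 20-character threshold is an arbitrary assumption, and the `SimpleNamespace` objects stand in for parsed pages.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def pages_needing_ocr(pages: Iterable[Any], min_chars: int = 20) -> List[int]:
    """Return page numbers whose extracted text is shorter than min_chars."""
    flagged: List[int] = []
    for page in pages:
        # Ignore surrounding whitespace so blank-looking pages are caught.
        text = (page.content or "").strip()
        if len(text) < min_chars:
            flagged.append(page.metadata["page_number"])
    return flagged


# Stand-in pages; in practice this would be the list returned by parser.parse().
sample = [
    SimpleNamespace(content="Chapter 1. Onboarding " * 5, metadata={"page_number": 1}),
    SimpleNamespace(content="  \n", metadata={"page_number": 2}),
]
print(pages_needing_ocr(sample))  # prints: [2]
```

Legitimately short pages, such as a title page, will also be flagged, so the result is best treated as a list to review rather than a verdict.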

License

This project is licensed under the Apache-2.0 License.

Release history

Version	Type	Released
0.11.0.dev1	pre-release (this version)	Jun 30, 2026
0.3.1.dev3	pre-release	May 20, 2026
0.3.1.dev2	pre-release	May 20, 2026
0.3.0	release	Mar 24, 2026
0.3.0.dev7	pre-release	Mar 23, 2026
0.3.0.dev5	pre-release	Mar 20, 2026
0.3.0.dev4	pre-release	Mar 20, 2026
0.3.0.dev3	pre-release	Mar 20, 2026
0.3.0.dev2	pre-release	Mar 20, 2026
0.3.0.dev1	pre-release	Mar 20, 2026
0.2.3.dev18	pre-release	Mar 20, 2026
0.2.3.dev17	pre-release	Mar 20, 2026
0.2.3.dev10	pre-release	Feb 23, 2026
0.2.3.dev5	pre-release	Feb 18, 2026
0.2.3.dev4	pre-release	Feb 17, 2026
0.2.3.dev3	pre-release	Feb 17, 2026
0.2.2	release	Feb 17, 2026
0.2.2.dev7	pre-release	Feb 17, 2026
0.2.2.dev6	pre-release	Feb 12, 2026
0.2.0	release	Jan 28, 2026
0.2.0.dev23	pre-release	Jan 27, 2026
0.2.0.dev4	pre-release	Sep 11, 2025
0.2.0.dev3	pre-release	Sep 10, 2025
0.2.0.dev2	pre-release	Sep 10, 2025
0.1.1	release	May 23, 2025
0.1.1.dev1	pre-release	May 23, 2025
0.1.0	release	May 23, 2025
0.1.0.dev20	pre-release	May 23, 2025

Download files

Source Distribution

swarmauri_parser_slate-0.11.0.dev1.tar.gz (8.0 kB), uploaded Jun 30, 2026, Source

Built Distribution

swarmauri_parser_slate-0.11.0.dev1-py3-none-any.whl (8.9 kB), uploaded Jun 30, 2026, Python 3

File details

Details for the file swarmauri_parser_slate-0.11.0.dev1.tar.gz.

File metadata

Download URL: swarmauri_parser_slate-0.11.0.dev1.tar.gz
Upload date: Jun 30, 2026
Size: 8.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.26 (uv publish, Ubuntu 24.04, CI)

File hashes

Hashes for swarmauri_parser_slate-0.11.0.dev1.tar.gz
Algorithm	Hash digest
SHA256	`d3d998d024053a342acde2fb7ac9275e1a58b95c80edd38f60d284b1fc04bd35`
MD5	`9d58f063cdae52e3be950e5131317099`
BLAKE2b-256	`3d8b2b42466196b77c3f61baec3cbe0961f4130ed61cd7c2ee5dc926f8cac851`

File details

Details for the file swarmauri_parser_slate-0.11.0.dev1-py3-none-any.whl.

File metadata

Download URL: swarmauri_parser_slate-0.11.0.dev1-py3-none-any.whl
Upload date: Jun 30, 2026
Size: 8.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.26 (uv publish, Ubuntu 24.04, CI)

File hashes

Hashes for swarmauri_parser_slate-0.11.0.dev1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`60fead509bd2927639dc606755ef92c54d242fb0d5f2356ba52684c4038c5287`
MD5	`9971fe92255e01dc23bc7aba03486e64`
BLAKE2b-256	`ea17817257d5d87cb3e135be780df1610082888bf05b856d2a29199a32fd6b4c`
