Agentic and single-pass Key Information Extraction (KIE) from documents using LLMs
Agentic KIE: LLM-Based Key Information Extraction from Documents
A Python package for extracting structured information from PDF documents using large language models.
agentic-kie loads PDFs (including scanned documents, via a pluggable OCR backend) and exposes both the raw text and rendered page images, so that LLMs can reason over document content using text, vision, or a combination of both. Two extraction strategies, a fast single-pass approach and a more capable agentic loop, are under active development; both are designed for use in production pipelines and research workflows alike.
Installation
Requires Python 3.13 or later.
pip install agentic-kie
Or with uv:
uv add agentic-kie
Quick start
Loading a PDF
PDFLoader is the main entry point. It handles file I/O, detects whether the document has a native text layer, and returns an immutable PDFDocument ready for downstream use.
from pathlib import Path
from agentic_kie import PDFLoader
loader = PDFLoader()
doc = loader.load(Path("invoice.pdf"))
# Access the full document text
print(doc.full_text)
# Navigate by page (zero-indexed, half-open ranges)
print(doc.read_text(0, 3)) # pages 0, 1, 2
print(doc.read_text(4)) # page 4 only
# Render pages to base64-encoded PNG strings (for vision models)
images = doc.all_images # all pages
first_page = doc.load_images(0) # single page
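Many vision-capable APIs accept images as data URLs, so a base64-encoded PNG string usually needs only a prefix before it can be sent. The sketch below is a minimal, standard-library-only illustration: `to_data_url` and `tiny_png` are hypothetical helpers, not part of agentic-kie, and the 1x1 PNG built here merely stands in for a rendered page.

```python
import base64
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def tiny_png() -> bytes:
    """Build a 1x1 8-bit grayscale PNG as a stand-in for a rendered page."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    scanline = b"\x00\x00"  # filter byte followed by one black pixel
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(scanline))
        + _chunk(b"IEND", b"")
    )


def to_data_url(png_b64: str) -> str:
    """Validate a base64 PNG string and wrap it as a data URL."""
    decoded = base64.b64decode(png_b64, validate=True)
    if not decoded.startswith(PNG_SIGNATURE):
        raise ValueError("payload is not a PNG image")
    return f"data:image/png;base64,{png_b64}"


# Stand-in for one page image; in practice this would come from the document.
page_b64 = base64.b64encode(tiny_png()).decode("ascii")
data_url = to_data_url(page_b64)
```

Checking the PNG signature before sending is optional, but it catches accidentally passing raw text or a double-encoded string to the model.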
PDFDocument exposes:
| Attribute / Method | Description |
|---|---|
| page_count | Total number of pages |
| is_ocr | True if text was extracted via OCR |
| full_text | All pages concatenated with double newlines |
| read_text(start, end=None) | Text slice over a page range |
| all_images | All pages as base64 PNGs (cached) |
| load_images(start, end=None) | Image slice over a page range |
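The zero-indexed, half-open range convention used by read_text and load_images can be illustrated with a small stand-in. This is a sketch of the semantics only, not the library's implementation: the free function `read_text` below works on a plain list of page strings, and both the double-newline join and the bounds check are assumptions.

```python
from typing import Optional, Sequence


def read_text(pages: Sequence[str], start: int, end: Optional[int] = None) -> str:
    """Return the text of pages[start:end]; a missing `end` means one page."""
    if end is None:
        end = start + 1  # single-page form: read_text(pages, 4) -> page 4 only
    if not 0 <= start < end <= len(pages):
        raise IndexError(f"page range [{start}, {end}) is out of bounds")
    # Assumption: slices are joined the same way as full_text.
    return "\n\n".join(pages[start:end])


pages = ["page 0", "page 1", "page 2", "page 3", "page 4"]
first_three = read_text(pages, 0, 3)  # pages 0, 1, 2 (end is exclusive)
last_page = read_text(pages, 4)       # page 4 only
```

With half-open ranges, `end - start` is always the number of pages returned, and adjacent ranges such as (0, 3) and (3, 5) cover the document without overlap.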
Scanned documents and OCR
For scanned PDFs, PDFLoader automatically detects the absence of a text layer and routes to an OCR provider. Any object that implements extract_text(image: bytes) -> str qualifies; no subclassing is required.
from pathlib import Path
from agentic_kie import PDFLoader, OCRProvider
class TextractProvider:
    def extract_text(self, image: bytes) -> str:
        # call AWS Textract (or any OCR service)
        ...
loader = PDFLoader(ocr_provider=TextractProvider())
doc = loader.load(Path("scanned_form.pdf"))
print(doc.is_ocr)  # True
print(doc.full_text)
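"No subclassing required" describes structural typing: a class qualifies by having the right method, not by inheriting from a base. One common way to express this in Python is `typing.Protocol`. Whether agentic-kie's OCRProvider is defined this way is an assumption; `OCRProviderLike` and both example classes below are hypothetical.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class OCRProviderLike(Protocol):
    """Hypothetical stand-in for the OCR interface described above."""

    def extract_text(self, image: bytes) -> str: ...


class StubOCR:
    """Qualifies structurally: it has extract_text and inherits from nothing."""

    def extract_text(self, image: bytes) -> str:
        return f"<{len(image)} bytes of image data>"


class NotAProvider:
    """Does not qualify: the method has a different name."""

    def recognise(self, image: bytes) -> str:
        return ""


def run_ocr(provider: OCRProviderLike, image: bytes) -> str:
    """Accept anything that satisfies the protocol."""
    return provider.extract_text(image)


stub_matches = isinstance(StubOCR(), OCRProviderLike)
other_matches = isinstance(NotAProvider(), OCRProviderLike)
ocr_output = run_ocr(StubOCR(), b"\x00\x01\x02")
```

Note that `runtime_checkable` only verifies that the method exists; argument and return types are checked by static type checkers, not at runtime.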
The dpi and text_threshold parameters let you control rendering resolution and the sensitivity of the native-text detection heuristic:
loader = PDFLoader(
    ocr_provider=TextractProvider(),
    dpi=300,            # higher DPI improves OCR accuracy on dense documents
    text_threshold=50,  # minimum avg characters/page to skip OCR
)
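To make the role of text_threshold concrete, here is one plausible reading of the heuristic as a standalone function. The package's actual detection logic is not shown in this document, so treat `needs_ocr`, its whitespace stripping, and its handling of empty input as assumptions.

```python
from typing import Sequence


def needs_ocr(page_texts: Sequence[str], text_threshold: int = 50) -> bool:
    """Return True when the native text layer looks too sparse to trust.

    The document counts as scanned when the average number of
    non-whitespace-padded characters per page falls below `text_threshold`.
    """
    if not page_texts:
        raise ValueError("document has no pages")
    total_chars = sum(len(text.strip()) for text in page_texts)
    average = total_chars / len(page_texts)
    return average < text_threshold


scanned = needs_ocr(["", "  ", ""])          # no usable text layer
native = needs_ocr(["x" * 120, "y" * 10])    # average of 65 chars per page
```

Under this reading, raising text_threshold sends more borderline documents (for example, scans with a thin embedded watermark layer) through OCR, while lowering it trusts sparse native text more readily.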
Error handling
All document-level failures derive from a common DocumentLoadError base class, so they can be caught together or individually:
from agentic_kie import (
    DocumentLoadError,
    CorruptDocumentError,
    PasswordProtectedError,
    EmptyDocumentError,
    OCRNotConfiguredError,
)
try:
    doc = loader.load(path)
except PasswordProtectedError:
    print("Document is encrypted")
except OCRNotConfiguredError:
    print("Scanned document detected: provide an OCR provider")
except DocumentLoadError as e:
    print(f"Load failed: {e}")
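A shared base class is most useful in batch jobs, where one bad file should not stop the run. The sketch below re-declares a few of the exception names locally so that it runs without the package installed; `load_all` and `fake_load` are hypothetical helpers, and the only behavior assumed about the real library is the common-base relationship stated above.

```python
from typing import Callable, Dict, Iterable, Tuple


# Local stand-ins mirroring the hierarchy described above.
class DocumentLoadError(Exception): ...
class PasswordProtectedError(DocumentLoadError): ...
class OCRNotConfiguredError(DocumentLoadError): ...


def load_all(
    paths: Iterable[str], load: Callable[[str], str]
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Load every path, collecting failures instead of aborting the batch."""
    loaded: Dict[str, str] = {}
    failed: Dict[str, str] = {}
    for path in paths:
        try:
            loaded[path] = load(path)
        except OCRNotConfiguredError:
            # Specific handlers must come before the base-class handler.
            failed[path] = "needs OCR provider"
        except DocumentLoadError as exc:
            failed[path] = f"{type(exc).__name__}: {exc}"
    return loaded, failed


def fake_load(path: str) -> str:
    """Stand-in loader that fails based on the file name."""
    if path.startswith("scan"):
        raise OCRNotConfiguredError("no text layer")
    if path.startswith("locked"):
        raise PasswordProtectedError("encrypted")
    return f"text of {path}"


loaded, failed = load_all(["a.pdf", "scan.pdf", "locked.pdf"], fake_load)
```

Because the base-class clause comes last, any new DocumentLoadError subclass added in a later release would still be recorded as a failure rather than crashing the batch.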
Extraction strategies
The extraction layer is under active development. Two strategies are planned:
- Single-pass: issues one structured prompt and parses the response directly against a Pydantic schema. Fast and predictable; suitable for well-structured documents.
- Agentic: a LangChain-powered agent loop that can reason iteratively, call tools, and refine its output over multiple steps. Better suited for complex or ambiguous documents.
Both strategies will accept a PDFDocument and a user-defined Pydantic schema, and return a validated extraction result.
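Since the extraction API is not yet published, the following is only a rough sketch of what a single-pass flow looks like in general: one prompt, one response, one validation step. To stay dependency-free it uses a standard-library dataclass in place of a Pydantic schema and a stub callable in place of an LLM; `Invoice`, `build_prompt`, `extract_single_pass`, and `fake_llm` are all hypothetical names.

```python
import json
from dataclasses import dataclass, fields
from typing import Callable


@dataclass(frozen=True)
class Invoice:
    """Stand-in schema; the real package is expected to take a Pydantic model."""

    invoice_number: str
    total: float


def build_prompt(text: str) -> str:
    """Ask for exactly the schema's fields as a JSON object."""
    names = ", ".join(f.name for f in fields(Invoice))
    return f"Return a JSON object with the keys: {names}.\n\nDocument:\n{text}"


def extract_single_pass(text: str, llm: Callable[[str], str]) -> Invoice:
    """Issue one prompt, then parse and validate the single response."""
    raw = json.loads(llm(build_prompt(text)))
    expected = {f.name for f in fields(Invoice)}
    if not isinstance(raw, dict) or set(raw) != expected:
        raise ValueError(f"response keys do not match schema: {raw!r}")
    # Coerce to the declared types; a bad value raises ValueError/TypeError.
    return Invoice(invoice_number=str(raw["invoice_number"]), total=float(raw["total"]))


def fake_llm(prompt: str) -> str:
    """Stub model that always returns the same well-formed answer."""
    return '{"invoice_number": "INV-001", "total": "42.50"}'


result = extract_single_pass("Invoice INV-001 ... Total due: 42.50", fake_llm)
```

An agentic strategy would differ mainly in wrapping the model call in a loop with tool access (for example, reading specific page ranges) before producing the final validated object.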
Contributing
See CONTRIBUTING.md for development setup, available make targets, and the CI/CD pipeline.