Skip to main content

Agentic and single-pass Key Information Extraction (KIE) from documents using LLMs

Project description

Agentic KIE: LLM-Based Key Information Extraction from Documents

CI CD codecov PyPI License: MIT

A Python package for extracting structured information from PDF documents using large language models.

agentic-kie handles the full extraction pipeline: it loads PDFs (including scanned documents via a pluggable OCR backend), and exposes both the raw text and rendered page images so that LLMs can reason over document content using text, vision, or a combination of both. Two extraction strategies are available — a fast single-pass approach and a more capable agentic loop — designed for use in production pipelines and research workflows alike.

Contents


Installation

Requires Python 3.13 or later.

pip install agentic-kie

Or with uv:

uv add agentic-kie

Quick start

Loading a PDF

PDFLoader is the main entry point. It handles file I/O, detects whether the document has a native text layer, and returns an immutable PDFDocument ready for downstream use.

from pathlib import Path
from agentic_kie import PDFLoader

loader = PDFLoader()
doc = loader.load(Path("invoice.pdf"))

# Access the full document text
print(doc.full_text)

# Navigate by page (zero-indexed, half-open ranges)
print(doc.read_text(0, 3))   # pages 0, 1, 2
print(doc.read_text(4))      # page 4 only

# Render pages to base64-encoded PNG strings (for vision models)
images = doc.all_images          # all pages
first_page = doc.load_images(0)  # single page

PDFDocument exposes:

Attribute / Method Description
page_count Total number of pages
is_ocr True if text was extracted via OCR
full_text All pages concatenated with double newlines
read_text(start, end=None) Text slice over a page range
all_images All pages as base64 PNGs (cached)
load_images(start, end=None) Image slice over a page range

Scanned documents and OCR

For scanned PDFs, PDFLoader automatically detects the absence of a text layer and routes to an OCR provider. Any object implementing extract_text(image: bytes) -> str qualifies — no subclassing required.

from agentic_kie import PDFLoader, OCRProvider

class TextractProvider:
    def extract_text(self, image: bytes) -> str:
        # call AWS Textract (or any OCR service)
        ...

loader = PDFLoader(ocr_provider=TextractProvider())
doc = loader.load(Path("scanned_form.pdf"))

print(doc.is_ocr)    # True
print(doc.full_text)

The dpi and text_threshold parameters let you control rendering resolution and the sensitivity of the native-text detection heuristic:

loader = PDFLoader(
    ocr_provider=TextractProvider(),
    dpi=300,            # higher DPI improves OCR accuracy on dense documents
    text_threshold=50,  # minimum avg characters/page to skip OCR
)

Error handling

All document-level failures raise from a common DocumentLoadError base, making them easy to catch together or individually:

from agentic_kie import (
    DocumentLoadError,
    CorruptDocumentError,
    PasswordProtectedError,
    EmptyDocumentError,
    OCRNotConfiguredError,
)

try:
    doc = loader.load(path)
except PasswordProtectedError:
    print("Document is encrypted")
except OCRNotConfiguredError:
    print("Scanned document detected — provide an OCR provider")
except DocumentLoadError as e:
    print(f"Load failed: {e}")

Extraction strategies

The extraction layer is under active development. Two strategies are planned:

  • Single-pass: issues one structured prompt and parses the response directly against a Pydantic schema. Fast and predictable; suitable for well-structured documents.
  • Agentic: a LangChain-powered agent loop that can reason iteratively, call tools, and refine its output over multiple steps. Better suited for complex or ambiguous documents.

Both strategies will accept a PDFDocument and a user-defined Pydantic schema, and return a validated extraction result.


Contributing

See CONTRIBUTING.md for development setup, available make targets, and the CI/CD pipeline.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agentic_kie-0.3.0.tar.gz (3.8 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

agentic_kie-0.3.0-py3-none-any.whl (12.2 kB view details)

Uploaded Python 3

File details

Details for the file agentic_kie-0.3.0.tar.gz.

File metadata

  • Download URL: agentic_kie-0.3.0.tar.gz
  • Upload date:
  • Size: 3.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for agentic_kie-0.3.0.tar.gz
Algorithm Hash digest
SHA256 2a2ad7ac872a5a7f5c792a164f60c5acfe6cbb3995df3d48515327ca67068400
MD5 f3c3c2d87d35f64f954fa41060377aa4
BLAKE2b-256 396960be7149a870e18d71c2b3a23ca9e4ae3c810968de9b6ced79a3f0735c92

See more details on using hashes here.

Provenance

The following attestation bundles were made for agentic_kie-0.3.0.tar.gz:

Publisher: cd.yml on gafnts/agentic-kie

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file agentic_kie-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: agentic_kie-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 12.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for agentic_kie-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 6f78c629440162ec558fd30c6529b36518266f8310f4d4d0cd6b90c330b98137
MD5 a254b7f17aca164016af86370469a82b
BLAKE2b-256 2899796d46205dde7495747383e32b4fa6df4d2fb696b9bb33313ae201ce4aa5

See more details on using hashes here.

Provenance

The following attestation bundles were made for agentic_kie-0.3.0-py3-none-any.whl:

Publisher: cd.yml on gafnts/agentic-kie

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page