Agentic and single-pass Key Information Extraction (KIE) from documents using LLMs

Agentic KIE: LLM-Based Key Information Extraction from Documents

A Python package for extracting structured information from PDF documents using large language models.

agentic-kie handles the full extraction pipeline: it loads PDFs (including scanned documents via a pluggable OCR backend) and exposes both the raw text and rendered page images, so that LLMs can reason over document content using text, vision, or a combination of both. Two extraction strategies, a fast single-pass approach and a more capable agentic loop, are planned for use in production pipelines and research workflows alike.

Installation

Requires Python 3.13 or later.

pip install agentic-kie

Or with uv:

uv add agentic-kie

Quick start

Loading a PDF

PDFLoader is the main entry point. It handles file I/O, detects whether the document has a native text layer, and returns an immutable PDFDocument ready for downstream use.

from pathlib import Path
from agentic_kie import PDFLoader

loader = PDFLoader()
doc = loader.load(Path("invoice.pdf"))

# Access the full document text
print(doc.full_text)

# Navigate by page (zero-indexed, half-open ranges)
print(doc.read_text(0, 3))   # pages 0, 1, 2
print(doc.read_text(4))      # page 4 only

# Render pages to base64-encoded PNG strings (for vision models)
images = doc.all_images          # all pages
first_page = doc.load_images(0)  # single page
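
The base64 PNG strings can be wrapped or decoded before being handed to a vision model. The following is a minimal sketch using only the standard library; the helper names `to_data_url` and `to_png_bytes` are hypothetical and not part of agentic-kie, and `fake_page` is a stand-in for one string produced by the image methods rather than a real rendered page.

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def to_data_url(b64_png: str) -> str:
    """Wrap a base64-encoded PNG string in a data URL (hypothetical helper)."""
    return f"data:image/png;base64,{b64_png}"


def to_png_bytes(b64_png: str) -> bytes:
    """Decode a base64 PNG string and check the PNG signature (hypothetical helper)."""
    raw = base64.b64decode(b64_png, validate=True)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return raw


# Stand-in for one base64 string from the document: the signature plus a
# dummy payload. It is not a decodable image, only enough for this sketch.
fake_page = base64.b64encode(PNG_SIGNATURE + b"placeholder").decode("ascii")

data_url = to_data_url(fake_page)
png_bytes = to_png_bytes(fake_page)
```

Many vision APIs accept either a data URL or raw bytes, so keeping both conversions next to each other can be convenient; which one applies depends on the model client in use.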

PDFDocument exposes:

Attribute / Method              Description
page_count                      Total number of pages
is_ocr                          True if text was extracted via OCR
full_text                       All pages concatenated with double newlines
read_text(start, end=None)      Text slice over a page range
all_images                      All pages as base64 PNGs (cached)
load_images(start, end=None)    Image slice over a page range

Scanned documents and OCR

For scanned PDFs, PDFLoader automatically detects the absence of a text layer and routes to an OCR provider. Any object implementing extract_text(image: bytes) -> str qualifies — no subclassing required.

from pathlib import Path
from agentic_kie import PDFLoader, OCRProvider

class TextractProvider:
    def extract_text(self, image: bytes) -> str:
        # call AWS Textract (or any OCR service)
        ...

loader = PDFLoader(ocr_provider=TextractProvider())
doc = loader.load(Path("scanned_form.pdf"))

print(doc.is_ocr)    # True
print(doc.full_text)
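
The "no subclassing required" rule is what Python calls structural typing. The sketch below shows the idea with typing.Protocol; `SupportsExtractText`, `EchoProvider`, and `run_ocr` are hypothetical names defined here for illustration, and the sketch does not claim that agentic-kie's own OCRProvider is defined this way.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsExtractText(Protocol):
    """Hypothetical protocol: anything with extract_text(image) -> str."""

    def extract_text(self, image: bytes) -> str: ...


class EchoProvider:
    """Toy provider that does no OCR; it does not inherit from the protocol."""

    def extract_text(self, image: bytes) -> str:
        return f"{len(image)} bytes received"


class NotAProvider:
    """Lacks extract_text, so it does not satisfy the protocol."""


def run_ocr(provider: SupportsExtractText, image: bytes) -> str:
    # Only the method's presence matters, not the provider's base classes.
    return provider.extract_text(image)


result = run_ocr(EchoProvider(), b"abc")
```

Note that a runtime isinstance check against a runtime_checkable protocol only verifies that the method exists, not that its signature matches; a static type checker is needed for the latter.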

The dpi and text_threshold parameters let you control rendering resolution and the sensitivity of the native-text detection heuristic:

loader = PDFLoader(
    ocr_provider=TextractProvider(),
    dpi=300,            # higher DPI improves OCR accuracy on dense documents
    text_threshold=50,  # minimum avg characters/page to skip OCR
)
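
One way to picture the text_threshold heuristic is as a comparison between the average number of extracted characters per page and the threshold. The function below is a sketch under that assumption; `needs_ocr` is a hypothetical name, and the library's actual detection logic may differ (for example in how whitespace or empty documents are treated).

```python
def needs_ocr(page_texts: list[str], text_threshold: int = 50) -> bool:
    """Return True when the native text layer looks too sparse to trust.

    Assumption: the heuristic averages non-whitespace-trimmed characters
    per page and skips OCR once that average reaches the threshold.
    """
    if not page_texts:
        raise ValueError("document has no pages")
    avg_chars = sum(len(text.strip()) for text in page_texts) / len(page_texts)
    return avg_chars < text_threshold


scanned = ["", "   "]              # no usable text layer
native = ["x" * 200, "y" * 100]    # average of 150 characters per page

scanned_needs_ocr = needs_ocr(scanned)
native_needs_ocr = needs_ocr(native)
```

Under this reading, raising text_threshold makes the loader more willing to fall back to OCR, which may help with PDFs whose text layer contains only a few stray characters such as page numbers.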

Error handling

All document-level failures raise exceptions that derive from a common DocumentLoadError base class, so they can be caught together or individually:

from agentic_kie import (
    DocumentLoadError,
    CorruptDocumentError,
    PasswordProtectedError,
    EmptyDocumentError,
    OCRNotConfiguredError,
)

try:
    doc = loader.load(path)
except PasswordProtectedError:
    print("Document is encrypted")
except OCRNotConfiguredError:
    print("Scanned document detected — provide an OCR provider")
except DocumentLoadError as e:
    print(f"Load failed: {e}")
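
The order of the except clauses matters: Python picks the first clause that matches, so specific subclasses must come before the base class. The sketch below demonstrates this with local stand-in classes that mirror the names above; they are defined here only so the example runs without the package installed and say nothing about how the real classes are implemented.

```python
# Local stand-ins mirroring the documented hierarchy (not the real classes).
class DocumentLoadError(Exception):
    pass


class PasswordProtectedError(DocumentLoadError):
    pass


class CorruptDocumentError(DocumentLoadError):
    pass


def classify(exc_type: type[Exception]) -> str:
    """Raise the given error type and report which except clause caught it."""
    try:
        raise exc_type("simulated failure")
    except PasswordProtectedError:
        return "encrypted"       # specific subclass, checked first
    except DocumentLoadError:
        return "load failed"     # catch-all for every other load error


encrypted = classify(PasswordProtectedError)
corrupt = classify(CorruptDocumentError)
```

If the DocumentLoadError clause were listed first, it would also catch PasswordProtectedError and the more specific handler would never run.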

Extraction strategies

The extraction layer is under active development. Two strategies are planned:

  • Single-pass: issues one structured prompt and parses the response directly against a Pydantic schema. Fast and predictable; suitable for well-structured documents.
  • Agentic: a LangChain-powered agent loop that can reason iteratively, call tools, and refine its output over multiple steps. Better suited for complex or ambiguous documents.

Both strategies will accept a PDFDocument and a user-defined Pydantic schema, and return a validated extraction result.
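
Since the extraction API is not yet published, the following is only a rough sketch of what a single-pass flow could look like: one prompt, one response, one validation step. Everything in it is an assumption made for illustration: `single_pass_extract`, `InvoiceFields`, and `fake_llm` are hypothetical, and a standard-library dataclass stands in for the Pydantic schema so that the example has no third-party dependencies.

```python
import json
from dataclasses import dataclass, fields
from typing import Callable


@dataclass(frozen=True)
class InvoiceFields:
    """Stand-in schema; the planned API is described as using Pydantic."""

    invoice_number: str
    total: float


def single_pass_extract(text: str, call_llm: Callable[[str], str]) -> InvoiceFields:
    """Issue one prompt and validate the JSON reply against the schema."""
    names = [f.name for f in fields(InvoiceFields)]
    prompt = (
        "Return a JSON object with exactly these keys: "
        + ", ".join(names)
        + "\n\nDocument:\n"
        + text
    )
    data = json.loads(call_llm(prompt))
    missing = [name for name in names if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Coerce values to the declared types; float() raises on bad input.
    return InvoiceFields(
        invoice_number=str(data["invoice_number"]),
        total=float(data["total"]),
    )


def fake_llm(prompt: str) -> str:
    """Canned reply standing in for a real model call."""
    return '{"invoice_number": "INV-001", "total": "42.50"}'


result = single_pass_extract("Invoice INV-001, total due 42.50", fake_llm)
```

An agentic strategy would differ mainly in wrapping a step like this in a loop with tool calls and retries, which is beyond what a short sketch can show faithfully.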


Contributing

See CONTRIBUTING.md for development setup, available make targets, and the CI/CD pipeline.
