openextract

Extract structured data from documents, images, audio, and video using LLMs.

Installation

uv add openextract

Or, with pip:

pip install openextract
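
To confirm which version was installed, a minimal sketch using the standard library's importlib.metadata follows. The helper name installed_version is hypothetical and not part of openextract; it only reads package metadata and does not import the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("openextract")
    print(found if found else "openextract is not installed")
```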

Usage

from pydantic import BaseModel
from openextract import extract

# Schema describing the fields to extract from the document.
class PdfInfo(BaseModel):
    summary: str
    language: str

# Run the extraction on a PDF referenced by URL, using the chosen model.
result = extract(
    schema=PdfInfo,
    model="openai:gpt-5.4",
    input_file="https://example.com/document.pdf",
    instructions="Return a two-sentence summary and the primary language of the document.",
)
print(result)

Changelog

See CHANGELOG.md for release history.

Contributing

See CONTRIBUTING.md for development setup and guidelines.

Download files

Source Distribution

openextract-0.3.2.tar.gz (75.6 kB)

Uploaded Source

Built Distribution

openextract-0.3.2-py3-none-any.whl (4.4 kB)

Uploaded Python 3

File details

Details for the file openextract-0.3.2.tar.gz.

File metadata

  • Download URL: openextract-0.3.2.tar.gz
  • Upload date:
  • Size: 75.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for openextract-0.3.2.tar.gz
Algorithm Hash digest
SHA256 5d36279b5bfa9890ff81e7a63b762a05ee5bc20b4557b66c5e43afe32a47ff54
MD5 b956c16379fbb1c12fbb75f9b1c359b2
BLAKE2b-256 74b11bb57a51eea822f1514d2d22eb4508c62f322046744f45e125792633c39a
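
A downloaded archive can be checked against the SHA256 digest listed above before installing. The following is a minimal sketch using only hashlib; the helper names sha256_of_file and matches_digest, and the local file path, are assumptions for illustration and not part of openextract or PyPI tooling.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for openextract-0.3.2.tar.gz.
EXPECTED_SHA256 = "5d36279b5bfa9890ff81e7a63b762a05ee5bc20b4557b66c5e43afe32a47ff54"


def sha256_of_file(path, chunk_size=1 << 16):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare the file's digest with an expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    archive = Path("openextract-0.3.2.tar.gz")  # assumed local download location
    if archive.exists():
        print("digest OK" if matches_digest(archive, EXPECTED_SHA256) else "digest MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

The same helpers work for the wheel by passing its file name and its published SHA256 digest instead.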

Provenance

The following attestation bundles were made for openextract-0.3.2.tar.gz:

Publisher: release.yml on Mellow-Artificial-Intelligence/openextract

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file openextract-0.3.2-py3-none-any.whl.

File metadata

  • Download URL: openextract-0.3.2-py3-none-any.whl
  • Upload date:
  • Size: 4.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for openextract-0.3.2-py3-none-any.whl
Algorithm Hash digest
SHA256 e9a5510f11efd6a332de22a045f9fc2939b4c8f08314e2feda3c90b8843dc5ec
MD5 d2d15f09de5a1f63f886faed139aa69e
BLAKE2b-256 73cc6a356be81a24617799b809e06418670134666551ff55bd609782fd883ec2

Provenance

The following attestation bundles were made for openextract-0.3.2-py3-none-any.whl:

Publisher: release.yml on Mellow-Artificial-Intelligence/openextract

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
