
Petey — The Easy PDF Extractor


pip install petey

Setup

Add your API key to a .env file:

OPENAI_API_KEY=sk-...

Or for Anthropic:

ANTHROPIC_API_KEY=sk-ant-...
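Before starting a large batch, it can help to confirm that one of these keys is actually present. The following is a minimal sketch using only the standard library. The helpers `parse_env` and `configured_providers` are hypothetical names, not part of Petey, and the parser assumes plain KEY=value lines.

```python
from pathlib import Path

# Variable names taken from the setup instructions above.
KEY_NAMES = {"OPENAI_API_KEY": "OpenAI", "ANTHROPIC_API_KEY": "Anthropic"}


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def configured_providers(env: dict[str, str]) -> list[str]:
    """Return the providers whose API key is present and non-empty."""
    return [name for key, name in KEY_NAMES.items() if env.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")
    env = parse_env(env_file.read_text()) if env_file.exists() else {}
    print(configured_providers(env) or "No API key found in .env")
```

This only checks that a key is written in the file. It does not verify that the key is valid or how Petey itself loads it.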

Usage

petey extract --schema schema.yaml ./pdfs/ -o results.csv

Options:

  • --model/-m (default: gpt-4.1-mini)
  • --concurrency/-c (default: 10)
  • --format/-f (csv/json/jsonl)
  • --output/-o
  • --instructions/-i
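When the command is launched from a script, the options can be assembled programmatically. The sketch below builds the argument list for `petey extract` using only the flags listed above. `build_command` is a hypothetical helper, not part of Petey, and the value given to `--instructions` is passed through unchanged because its expected form is not specified here.

```python
import shlex


def build_command(schema, pdf_dir, output, model=None, concurrency=None,
                  fmt=None, instructions=None):
    """Assemble the argument list for `petey extract`."""
    cmd = ["petey", "extract", "--schema", schema, pdf_dir, "--output", output]
    # Only add a flag when the caller overrides the CLI default.
    if model is not None:
        cmd += ["--model", model]
    if concurrency is not None:
        cmd += ["--concurrency", str(concurrency)]
    if fmt is not None:
        cmd += ["--format", fmt]
    if instructions is not None:
        cmd += ["--instructions", instructions]
    return cmd


cmd = build_command("schema.yaml", "./pdfs/", "results.jsonl",
                    concurrency=4, fmt="jsonl")
# Show the shell-quoted command that would be run.
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would run the extraction, assuming `petey` is installed and on the PATH.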

Schema

name: Invoice
fields:
  vendor:
    type: string
    description: Company name on the invoice
  amount:
    type: number
    description: Total amount due
  date:
    type: date
    description: Invoice date
  status:
    type: enum
    values: [Paid, Unpaid, Overdue]
    description: Payment status

Field types: string, number, date, enum (with or without values), array (with nested fields).
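A typo in a field type is easy to miss before a long run, so a quick pre-check can be useful. The following is a rough sketch that validates a `fields` mapping against the types listed above. It works on a plain dict (for example, one loaded from the YAML file with a YAML parser of your choice). `check_fields` is a hypothetical helper, and it does not inspect the nested fields of `array` types because their layout is not shown here.

```python
ALLOWED_TYPES = {"string", "number", "date", "enum", "array"}


def check_fields(fields: dict) -> list[str]:
    """Return a list of problems found in a schema's `fields` mapping."""
    problems = []
    for name, spec in fields.items():
        field_type = spec.get("type")
        if field_type not in ALLOWED_TYPES:
            problems.append(f"{name}: unknown type {field_type!r}")
        # Enum values are optional, but when given they should be a list.
        if field_type == "enum" and "values" in spec:
            if not isinstance(spec["values"], list):
                problems.append(f"{name}: 'values' should be a list")
    return problems


# The Invoice fields from the schema above, written as a dict.
invoice_fields = {
    "vendor": {"type": "string", "description": "Company name on the invoice"},
    "amount": {"type": "number", "description": "Total amount due"},
    "date": {"type": "date", "description": "Invoice date"},
    "status": {"type": "enum", "values": ["Paid", "Unpaid", "Overdue"],
               "description": "Payment status"},
}

print(check_fields(invoice_fields))  # → []
```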

All fields are nullable: the LLM returns null for anything it can't find.
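Because any field may come back as null, downstream code should expect missing values. The sketch below counts nulls per field in JSONL output. It assumes each output line is one JSON object keyed by the schema's field names, which is not confirmed here. `count_missing` is a hypothetical helper and the sample records are made up for illustration.

```python
import json
from collections import Counter


def count_missing(jsonl_text: str, field_names: list[str]) -> Counter:
    """Count, per field, how many records have a null or absent value."""
    missing = Counter()
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        for name in field_names:
            if record.get(name) is None:
                missing[name] += 1
    return missing


# Made-up sample records for illustration only.
sample = "\n".join([
    '{"vendor": "Acme", "amount": 120.5, "date": null, "status": "Paid"}',
    '{"vendor": null, "amount": null, "date": "2024-01-31", "status": "Unpaid"}',
])

print(count_missing(sample, ["vendor", "amount", "date", "status"]))
```

A field with a high null count is often a sign that its description needs to be more specific.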

Set record_type: array at the top level for table extraction (multiple records per document).

Add instructions at the top level to append guidance to the system prompt.
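Both top-level keys sit alongside `name` and `fields`, not inside a field. As an illustration, the sketch below appends them to an existing schema text. `add_top_level_options` is a hypothetical helper, and the instruction string is only an example.

```python
import json
from typing import Optional

BASE_SCHEMA = """\
name: Invoice
fields:
  vendor:
    type: string
    description: Company name on the invoice
  amount:
    type: number
    description: Total amount due
"""


def add_top_level_options(schema_text: str, table: bool = False,
                          instructions: Optional[str] = None) -> str:
    """Append top-level `record_type` and `instructions` keys to a schema."""
    lines = [schema_text.rstrip("\n")]
    if table:
        # Table extraction: one document yields multiple records.
        lines.append("record_type: array")
    if instructions:
        # json.dumps gives a double-quoted scalar, which YAML also accepts.
        lines.append(f"instructions: {json.dumps(instructions)}")
    return "\n".join(lines) + "\n"


print(add_top_level_options(
    BASE_SCHEMA,
    table=True,
    instructions="Extract one record per table row.",
))
```

The printed text can be saved as `schema.yaml` and passed to `--schema`.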

Download files

Source Distribution: petey-0.1.1.tar.gz (9.0 kB)

Built Distribution: petey-0.1.1-py3-none-any.whl (7.8 kB, Python 3)

